Abstract

Motivated by several applications, we consider the problem of randomly rounding a fractional solution
in a matroid (base) polytope to an integral one. We consider the pipage rounding technique [5, 6, 36] and
also present a new technique, randomized swap rounding. Our main technical results are concentration
bounds for functions of random variables arising from these rounding techniques. We prove Chernoff-type
concentration bounds for linear functions of random variables arising from both techniques, and also
a lower-tail exponential bound for monotone submodular functions of variables arising from randomized
swap rounding.

The following are examples of our applications. 

• We give a $(1 - 1/e - \varepsilon)$-approximation algorithm for the problem of maximizing a monotone
submodular function subject to 1 matroid and $k$ linear constraints, for any constant $k \ge 1$ and
$\varepsilon > 0$. We also give the same result for a super-constant number $k$ of "loose" linear constraints,
where the right-hand side dominates the matrix entries by an $\Omega(\varepsilon^{-2} \log k)$ factor.

• We present a result on minimax packing problems that involve a matroid base constraint. We give
an $O(\log m / \log\log m)$-approximation for the general problem
$\min\{\lambda : \exists x \in \{0,1\}^N, x \in B(\mathcal{M}), Ax \le \lambda b\}$ where $m$ is the number of
packing constraints. Examples include the low-congestion multi-path routing problem [34] and
spanning-tree problems with capacity constraints on cuts [4, 16].

• We generalize the continuous greedy algorithm [35] to problems involving multiple submodular
functions, and use it to find a $(1 - 1/e - \varepsilon)$-approximate pareto set for the problem of maximizing
a constant number of monotone submodular functions subject to a matroid constraint. An example is
the Submodular Welfare Problem where we are looking for an approximate pareto set with respect to
individual players' utilities.



*Dept. of Computer Science, Univ. of Illinois, Urbana, IL 61801. Partially supported by NSF grant CCF-0728782. E-mail:
chekuri@cs.illinois.edu

†IBM Almaden Research Center, San Jose, CA 95120. E-mail: jvondrak@us.ibm.com

‡Institute for Operations Research, ETH Zurich. E-mail: rico.zenklusen@ifor.math.ethz.ch



1 Introduction 



Randomized rounding is a fundamental technique introduced by Raghavan and Thompson [29] in order to
round a fractional solution of an LP into an integral solution. Numerous applications and variants have since 
been explored and it is a standard technique in the design of approximation algorithms and related areas. The 
original technique from [29] (and several subsequent papers) relies on independent rounding of the variables
which allows one to use Chernoff-Hoeffding concentration bounds for linear functions of the variables; these 
bounds are critical for several applications in packing and covering problems. However, there are many situa- 
tions in which independent rounding is not feasible due to the presence of constraints that cannot be violated by 
the rounded solution. Various techniques are used to handle such scenarios. To name just a few: alteration of 
solutions obtained by independent rounding, careful derandomization or constructive methods when probability 
of a feasible solution is non-zero but small (for example when using the Lovász Local Lemma), and various
forms of correlated or dependent randomized rounding schemes. These methods are typically successful when 
one is interested in preserving the expected value of the sum of several random variables; the rounding schemes 
approximately preserve the expected value of each random variable and then one relies on linearity of expecta- 
tion for the sum. There are, however, applications where one cannot use independent rounding and nevertheless 
one needs concentration bounds and/or the ability to handle non-linear objective functions such as convex or 
submodular functions of the variables; the work of Srinivasan [34] and others [14, 19] highlights some of these
applications. Our focus in this paper is on such schemes. In particular we consider the problem of rounding 
a point in a matroid polytope to a vertex. We compare the existing approaches and propose a new rounding 
scheme which is simple and has multiple applications. 

Background: Matroid polytopes, whose study was initiated by Edmonds in the 70's, form one of the most 
important classes of polytopes associated with combinatorial optimization problems. (For a definition, see 
Section 2.) Even though the full description of a matroid polytope is exponentially large, matroid polytopes can
be optimized over, separated over, and they have strong integrality properties such as total dual integrality. As 
a consequence, the basic solution of a linear optimization problem over a matroid polytope is always integral 
and no rounding is necessary. 

More recently, various applications emerged where a matroid constraint appears with additional constraints 
and/or the objective function is non-linear. In such cases, the issue of rounding a fractional solution in the 
matroid polytope re-appears as a non-trivial question. One such application is the submodular welfare problem
[12, 22], which can be formulated as a submodular maximization problem subject to a partition matroid
constraint. The rounding technique that turned out to be useful in this context is pipage rounding [5].

Pipage rounding was introduced by Ageev and Sviridenko [1], who used it for rounding fractional solutions
in the bipartite matching polytope. They used a linear program to obtain a fractional solution to a certain
problem, but the rounding procedure was based on an auxiliary (non-linear) objective. The auxiliary objective
$F(x)$ was defined in such a way that $F(x)$ would always increase or stay constant throughout the rounding
procedure. A comparison between $F(x)$ and the original objective yields an approximation guarantee.
Calinescu et al. [5] adapted the pipage rounding technique to problems involving a matroid constraint rather than
bipartite matchings. Moreover, they showed that the necessary convexity properties are satisfied whenever
the auxiliary function $F(x)$ is a multilinear extension of a submodular set function $f$. This turned out to be
crucial for further developments on submodular maximization problems, in particular an optimal
$(1 - 1/e)$-approximation for maximizing a monotone submodular function subject to a matroid constraint [35, 6], and a
$(1 - 1/e - \varepsilon)$-approximation for maximizing a monotone submodular function subject to a constant number of
linear constraints [18]. As one of our applications, we consider a common generalization of these two problems.

Srinivasan [34], and building on his work Gandhi et al. [14], considered dependent randomized rounding
for points in the bipartite matching polytope (and more generally the assignment polytope); their technique can
be viewed as a randomized (and oblivious) version of pipage rounding. The motivation for this randomized
scheme came from a different set of applications (see [34]). The results in [34, 14] showed negative correlation
properties for their rounding scheme which implied concentration bounds (via [28]) that were then useful in
dealing with additional constraints. We make some observations regarding the results and applications in
[3, 34, 14]. Although the schemes round a point in the assignment polytope, each constraint and objective function is
restricted to depend on a subset of the edges incident to some vertex in the underlying bipartite graph. Further,
several of the applications in [3, 34, 14] can be naturally modeled via a matroid constraint instead of using a
bipartite graph with the above mentioned restriction; in fact the simple partition matroid suffices.

The pipage rounding technique for matroids, as presented in [5], is a deterministic procedure. However, 
it can be randomized similarly to Srinivasan's work [34], and this is the variant presented in [6]. This variant
starts with a fractional solution in the matroid base polytope, $y \in B(\mathcal{M})$, and produces a random base
$B \in \mathcal{M}$ such that $\mathbf{E}[f(B)] \ge F(y)$; here $F$ is the multilinear extension of the submodular function $f$.
A further rounding stage is needed in case the starting point is inside the matroid polytope $P(\mathcal{M})$ rather than the
matroid base polytope $B(\mathcal{M})$; pipage rounding has been extended to this case in [36]. In the analysis of [6, 36],
the approximation guarantees are only in expectation. Stronger guarantees could be obtained and additional 
applications would arise if we could prove concentration bounds on the value of linear/submodular functions 
under such a rounding procedure. This is the focus of this paper. 

Very recently, another application has emerged where rounding in a matroid polytope plays an essential 
role. Asadpour et al. [2] present a new approach to the Asymmetric Traveling Salesman problem achieving an
$O(\log n / \log\log n)$-approximation, improving upon the long-standing $O(\log n)$-approximation. A crucial step
in the algorithm is a rounding procedure, which given a fractional solution in the spanning tree polytope produces
a spanning tree satisfying certain additional constraints. The authors of [2] use the technique of maximum
entropy sampling which gives negative correlation properties and Chernoff-type concentration bounds for any
linear function on the edges of the graph. Since spanning trees are bases in the graphic matroid for any graph,
this rounding procedure also falls in the framework of randomized rounding in the matroid polytope. However,
it is not clear whether the technique of [2] can be generalized to any matroid or whether it could be used in
applications with a submodular objective function. 

1.1 Our work 

In this paper we study the problem of randomly rounding a point in a matroid polytope to a vertex of the 
polytope.¹ We consider the technique of randomized pipage rounding and also introduce a new rounding
procedure called randomized swap rounding. Given a starting point $x \in P(\mathcal{M})$, the procedure produces a
random independent set $S \in \mathcal{I}$ such that $\Pr[i \in S] = x_i$ for each element $i$. Our main technical results are
concentration bounds for linear and submodular functions f(S) under this new rounding. We demonstrate the 
usefulness of these concentration bounds via several applications. 

The randomized swap rounding procedure bears some similarity to pipage rounding and can be used as a 
replacement for pipage rounding in [6, 36]. It can be also used as a replacement for maximum entropy sampling
in [2]. However, it has several advantages over previous rounding procedures. It is easy to describe and
implement, and it is very efficient. Moreover, thanks to the simplicity of randomized swap rounding, we are able
to derive results that are not known for previous techniques. One example is the tail estimate for submodular
functions, Theorem 1.4. On the other hand, our concentration bound for linear functions (Corollary 1.2) holds
for a more general class of rounding techniques including pipage rounding (see also Lemma 4.1).

Randomized swap rounding starts from an arbitrary representation of a starting point $x \in P(\mathcal{M})$ as a
convex combination of incidence vectors of independent sets. (This representation can be obtained by standard
techniques and in some applications it is explicitly available.) Once a convex representation of the starting point
is obtained, the running time of randomized swap rounding is bounded by $O(nd^2)$ calls to the membership oracle
of the matroid, where $d$ is the rank of the matroid and $n$ is the size of the ground set. In comparison, pipage
rounding performs $O(n^2)$ iterations each of which requires an expensive call to submodular function minimization
(see [6]). Maximum entropy sampling for spanning trees in a graph $G = (V, E)$ is even more complicated;
[2] does not provide an explicit running time, but it states that the procedure involves $O(|E|^2 |V| \log |V|)$
iterations, where in each iteration one needs to compute a determinant (from Kirchhoff's matrix theorem) for each
edge. Also, maximum entropy sampling preserves the marginal probabilities $\Pr[i \in S] = x_i$ only approximately,
and the running time depends on the desired accuracy.

¹ Our results extend easily to the case of rounding a point in the polytope of an integer valued polymatroid.
Additional applications may follow from this.

First, we show that randomized swap rounding as well as pipage rounding have the property that the indicator
variables $X_i = [i \in S]$ have expectations exactly $x_i$, and are negatively correlated.

Theorem 1.1. Let $(x_1, \ldots, x_n) \in P(\mathcal{M})$ be a fractional solution in the matroid polytope and
$(X_1, \ldots, X_n) \in \{0,1\}^n$ an integral solution obtained using either randomized swap rounding or randomized
pipage rounding. Then $\mathbf{E}[X_i] = x_i$, and for any $T \subseteq [n]$,
(i) $\mathbf{E}[\prod_{i \in T} X_i] \le \prod_{i \in T} x_i$,
(ii) $\mathbf{E}[\prod_{i \in T} (1 - X_i)] \le \prod_{i \in T} (1 - x_i)$.

This yields Chernoff-type concentration bounds for any linear function of $X_1, \ldots, X_n$, as proved by
Panconesi and Srinivasan [28] (see also Theorem 3.1 in [14]). Together with Theorem 1.1 we obtain:

Corollary 1.2. Let $a_i \in [0, 1]$ and $X = \sum a_i X_i$, where $(X_1, \ldots, X_n)$ are obtained by either randomized swap
rounding or randomized pipage rounding from a starting point $(x_1, \ldots, x_n) \in P(\mathcal{M})$.

• If $\delta \ge 0$ and $\mu \ge \mathbf{E}[X] = \sum a_i x_i$, then
$\Pr[X \ge (1 + \delta)\mu] \le \left( \frac{e^{\delta}}{(1+\delta)^{1+\delta}} \right)^{\mu}$;
for $\delta \in [0, 1]$, the bound can be simplified to $\Pr[X \ge (1 + \delta)\mu] \le e^{-\mu\delta^2/3}$.

• If $\delta \in [0, 1]$, and $\mu \le \mathbf{E}[X] = \sum a_i x_i$, then $\Pr[X \le (1 - \delta)\mu] \le e^{-\mu\delta^2/2}$.

In particular, these bounds hold for $X = \sum_{i \in S} X_i$ where $S$ is an arbitrary subset of the variables. We
remark that in contrast, when randomized pipage rounding is performed on bipartite graphs, negative correlation
holds only for subsets of edges incident to a fixed vertex [14].
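To get a feel for the magnitudes in Corollary 1.2, the three bounds can be evaluated numerically. The following is a minimal sketch; the function names are ours and purely illustrative, and the code only evaluates the closed-form expressions stated above.

```python
import math


def upper_tail_bound(mu: float, delta: float) -> float:
    """Bound (e^delta / (1 + delta)^(1 + delta))^mu on Pr[X >= (1 + delta) mu]."""
    if mu < 0 or delta < 0:
        raise ValueError("need mu >= 0 and delta >= 0")
    # Evaluate the exponent in log space to avoid overflow for large mu.
    return math.exp(mu * (delta - (1.0 + delta) * math.log1p(delta)))


def simplified_upper_tail_bound(mu: float, delta: float) -> float:
    """Weaker bound e^(-mu delta^2 / 3), stated for delta in [0, 1]."""
    if mu < 0 or not 0 <= delta <= 1:
        raise ValueError("need mu >= 0 and delta in [0, 1]")
    return math.exp(-mu * delta * delta / 3.0)


def lower_tail_bound(mu: float, delta: float) -> float:
    """Bound e^(-mu delta^2 / 2) on Pr[X <= (1 - delta) mu], for delta in [0, 1]."""
    if mu < 0 or not 0 <= delta <= 1:
        raise ValueError("need mu >= 0 and delta in [0, 1]")
    return math.exp(-mu * delta * delta / 2.0)


if __name__ == "__main__":
    for delta in (0.1, 0.5, 1.0):
        print(delta, upper_tail_bound(50, delta),
              simplified_upper_tail_bound(50, delta), lower_tail_bound(50, delta))
```

For $\delta \in [0, 1]$ the simplified upper-tail expression is never smaller than the general one, which is why it is a valid (if weaker) replacement.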

More generally, we consider concentration properties for a monotone submodular function $f(R)$, where $R$
is the outcome of randomized rounding. Equivalently, we can also write $f(R) = f(X_1, X_2, \ldots, X_n)$ where
$X_i \in \{0, 1\}$ is a random variable indicating whether $i \in R$. First, we consider a scenario where
$X_1, \ldots, X_n$ are independent random variables. We prove that in this case, Chernoff-type bounds hold for
$f(X_1, X_2, \ldots, X_n)$ just like they would for a linear function.

Theorem 1.3. Let $f : \{0, 1\}^n \to \mathbb{R}_+$ be a monotone submodular function with marginal values in $[0, 1]$. Let
$X_1, \ldots, X_n$ be independent random variables in $\{0, 1\}$. Let $\mu = \mathbf{E}[f(X_1, X_2, \ldots, X_n)]$. Then for any $\delta > 0$,

• $\Pr[f(X_1, \ldots, X_n) \ge (1 + \delta)\mu] \le \left( \frac{e^{\delta}}{(1+\delta)^{1+\delta}} \right)^{\mu}$.

• $\Pr[f(X_1, \ldots, X_n) \le (1 - \delta)\mu] \le e^{-\mu\delta^2/2}$.

We remark that Theorem 1.3 can be used to simplify previous results for submodular maximization under
linear constraints, where variables are rounded independently [18]. Furthermore, we prove a lower-tail bound
in the dependent rounding case, where $X_1, \ldots, X_n$ are produced by randomized swap rounding.

Theorem 1.4. Let $f(S)$ be a monotone submodular function with marginal values in $[0, 1]$, and
$F(x) = \mathbf{E}[f(\hat{x})]$ its multilinear extension. Let $(x_1, \ldots, x_n) \in P(\mathcal{M})$ be a point in a matroid polytope
and $R$ a random independent set obtained from it by randomized swap rounding. Let $\mu_0 = F(x_1, \ldots, x_n)$ and
$\delta > 0$. Then $\mathbf{E}[f(R)] \ge \mu_0$ and

$\Pr[f(R) \le (1 - \delta)\mu_0] \le e^{-\mu_0\delta^2/8}$.

We do not know how to derive this result using only the property of negative correlations; in particular, 
we do not have a proof for pipage rounding, although we suspect that a similar tail estimate holds. (Weaker 
tail estimates involving a dependence on n follow directly from martingale concentration bounds; the main 
difficulty here is to obtain a bound which does not depend on n.) We remark that the tail estimate is with 
respect to the value of the starting point, $\mu_0 = F(x_1, \ldots, x_n)$, rather than the actual expectation of $f(R)$,
which could be larger (it would be equal for a linear function $f$, or under independent rounding). For this
reason, we do not have an upper tail bound. However, $\mu_0$ is the value that we want to achieve in applications
and hence this is the bound that we need. 

Applications: We next discuss several applications of our rounding scheme. While some of the applications 
are concrete, others are couched in a general framework; specific instantiations lead to various applications 
new and old, and we defer some of these to a later version of the paper. Our rounding procedure can be used 
to improve the running time of some previous applications of pipage rounding [6, 36] and maximum entropy
sampling [2]. In particular, our technique significantly simplifies the algorithm and analysis in the recent
$O(\log n / \log\log n)$-approximation for the Asymmetric Traveling Salesman problem [2]. In other applications,
we obtain approximations with high probability instead of in expectation [6, 36]. Details of these improvements
are deferred. Our new applications are as follows. 

Submodular maximization subject to 1 matroid and k linear constraints. Given a monotone submodular function
$f : 2^N \to \mathbb{R}_+$, a matroid $\mathcal{M}$ on the same ground set $N$, and a system of $k$ linear packing constraints
$Ax \le b$, we consider the following problem: $\max\{f(x) : x \in P(\mathcal{M}), Ax \le b, x \in \{0, 1\}^n\}$. This problem
is a common generalization of two previously studied problems, monotone submodular maximization subject
to a matroid constraint [6] and subject to a constant number of linear constraints [18]. For any fixed
$\varepsilon > 0$ and $k \ge 0$, we obtain a $(1 - 1/e - \varepsilon)$-approximation for this problem, which is optimal up to the
arbitrarily small $\varepsilon$ (even for 1 matroid or 1 linear constraint [25, 11]), and generalizes the previously known
results in the two special cases. We also obtain a $(1 - 1/e - \varepsilon)$-approximation when the constraints are
sufficiently "loose"; that is, $b_i \ge \Omega(\varepsilon^{-2} \log k) \cdot A_{ij}$ for all $i, j$.

Minimax Integer Programs subject to a matroid constraint. Let $\mathcal{M}$ be a matroid on a ground set $N$ (let $n = |N|$).
Let $B(\mathcal{M})$ be the base polytope of $\mathcal{M}$. We consider the problem
$\min\{\lambda : Ax \le \lambda b, x \in B(\mathcal{M}), x \in \{0, 1\}^n\}$
where $A \in \mathbb{R}_+^{m \times n}$ and $b \in \mathbb{R}_+^m$. We give an $O(\log m / \log\log m)$-approximation for this problem, and a
similar result for the min-cost version (with given packing constraints and element costs). This generalizes earlier
results on minimax integer programs which were considered in the context of routing and partitioning problems
[29, 23, 33, 34, 14]; the underlying matroid in these settings is the partition matroid. Another application fitting
in this framework is the minimum crossing spanning tree problem and its geometric variant, the minimum
stabbing spanning tree problem. We elaborate on these in Section 6.

Multiobjective optimization with submodular functions. Suppose we are given a matroid $\mathcal{M} = (N, \mathcal{I})$ and a
constant number of monotone submodular functions $f_1, \ldots, f_k : 2^N \to \mathbb{R}_+$. Given a set of "target values"
$V_1, \ldots, V_k$, we either find a certificate that there is no solution $S \in \mathcal{I}$ such that $f_i(S) \ge V_i$ for all $i$,
or we find a solution $S$ such that $f_i(S) \ge (1 - 1/e - \varepsilon)V_i$ for all $i$. Using the framework of multiobjective
optimization [27], this implies that we can efficiently find a $(1 - 1/e - \varepsilon)$-approximate pareto curve for the
problem of maximizing $k$ monotone submodular functions subject to a matroid constraint. A natural special case of
this is the Submodular Welfare problem, where each objective function $f_i(S)$ represents the utility of player $i$.
That is, we can find a $(1 - 1/e - \varepsilon)$-approximate pareto curve with respect to the utilities of the $k$ players
(for $k$ constant). This result involves a new variant of the continuous greedy algorithm from [35], which in some
sense optimizes multiple submodular functions at the same time. With linear objective functions $f_i$, we obtain
the same guarantees with $1 - \varepsilon$ instead of $1 - 1/e - \varepsilon$. We give more details in Section 7.

Organization: In Section 2 we present the necessary definitions. In Section 3 the randomized swap
rounding procedure is introduced. In Section 4 we prove a negative correlation property for a class of rounding
procedures including randomized swap rounding and pipage rounding. In Section 5 we present our algorithm
for maximizing a monotone submodular function subject to 1 matroid and k linear constraints. In Section 6
we present our results on minimax integer programs. In Section 7 we present our results on multiobjective
optimization. In Appendix A we give a complete description of randomized pipage rounding. In Appendix B
we present a generalization of swap rounding for rounding points in the matroid polytope rather than the base
polytope. In Appendix C we present our concentration bounds for submodular functions under independent
rounding, and in Appendix D our lower-tail bound under randomized swap rounding.

2 Preliminaries



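Matroid polytopes. Given a matroid $\mathcal{M} = (N, \mathcal{I})$ with rank function $r : 2^N \to \mathbb{Z}_+$, two polytopes associated
with $\mathcal{M}$ are the matroid polytope $P(\mathcal{M})$ and the matroid base polytope $B(\mathcal{M})$ (see also [30]). $P(\mathcal{M})$ is
the convex hull of characteristic vectors of the independent sets of $\mathcal{M}$.

$P(\mathcal{M}) = \mathrm{conv}\{\mathbf{1}_I : I \in \mathcal{I}\} = \{x \ge 0 : \forall S;\ \sum_{i \in S} x_i \le r(S)\}$

$B(\mathcal{M})$ is the convex hull of the characteristic vectors of the bases $\mathcal{B}$ of $\mathcal{M}$, i.e. independent sets of maximum
cardinality.

$B(\mathcal{M}) = \mathrm{conv}\{\mathbf{1}_B : B \in \mathcal{B}\} = P(\mathcal{M}) \cap \{x : \sum_{i \in N} x_i = r(N)\}.$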
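The inequality descriptions above can be checked directly on tiny instances. The sketch below is a brute-force illustration only: it enumerates all $2^{|N|}$ subsets, so it is exponential and not how one would separate over a matroid polytope in practice. The function names and the `rank` callback are our own assumptions for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Hashable

RankFn = Callable[[FrozenSet[Hashable]], int]


def in_matroid_polytope(x: Dict[Hashable, float], rank: RankFn, tol: float = 1e-9) -> bool:
    """Check x >= 0 and x(S) <= r(S) for every nonempty subset S of the ground set."""
    elems = list(x)
    if any(x[e] < -tol for e in elems):
        return False
    for size in range(1, len(elems) + 1):
        for subset in combinations(elems, size):
            if sum(x[e] for e in subset) > rank(frozenset(subset)) + tol:
                return False
    return True


def in_base_polytope(x: Dict[Hashable, float], rank: RankFn, tol: float = 1e-9) -> bool:
    """Additionally require x(N) = r(N)."""
    full_rank = rank(frozenset(x))
    return in_matroid_polytope(x, rank, tol) and abs(sum(x.values()) - full_rank) <= tol


if __name__ == "__main__":
    # Uniform matroid of rank 2 on four elements: r(S) = min(|S|, 2).
    uniform_rank = lambda s: min(len(s), 2)
    print(in_base_polytope({0: 0.5, 1: 0.5, 2: 0.5, 3: 0.5}, uniform_rank))
```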

Matroid exchange properties. To simplify notation, we use $+$ and $-$ for the addition and deletion of single
elements from a set, for example $S - i + j$ denotes the set $(S \setminus \{i\}) \cup \{j\}$. The following base exchange
property of matroids is crucial in the design of our rounding algorithm. 

Theorem 2.1. Let $\mathcal{M} = (N, \mathcal{I})$ be a matroid and let $B_1, B_2 \in \mathcal{B}$. For any $i \in B_1 \setminus B_2$ there exists
$j \in B_2 \setminus B_1$ such that $B_1 - i + j \in \mathcal{B}$ and $B_2 - j + i \in \mathcal{B}$.

To find an element $j$ that corresponds to a given element $i$ as described in the above theorem, one can simply
check all elements in $B_2 \setminus B_1$. Thus a corresponding element $j$ can be found by $O(d)$ calls to an independence
oracle, where $d$ is the rank of the matroid. For many matroids, a corresponding element $j$ can be found faster.
In particular, for the graphic matroid, $j$ can be chosen to be any element $\ne i$ that lies simultaneously in the cut
defined by the connected components of $B_1 - i$ and in the unique cycle in $B_2 + i$.
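The generic search over $B_2 \setminus B_1$ can be sketched as follows. This is a minimal illustration under our own naming: `find_exchange_partner` and `is_forest` are hypothetical helpers, with `is_forest` playing the role of an independence oracle for a graphic matroid. Since the candidate sets have the same cardinality as a base, independence already implies that they are bases.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple


def find_exchange_partner(b1: Set[Hashable], b2: Set[Hashable], i: Hashable,
                          is_independent: Callable[[Set[Hashable]], bool]) -> Hashable:
    """Return j in b2 \\ b1 such that b1 - i + j and b2 - j + i are both independent."""
    for j in b2 - b1:
        if is_independent((b1 - {i}) | {j}) and is_independent((b2 - {j}) | {i}):
            return j
    raise ValueError("no exchange partner found; inputs are not two bases of one matroid")


def is_forest(edges: Iterable[Tuple[int, int]]) -> bool:
    """Independence oracle of a graphic matroid: the edge set must be acyclic."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # adding (u, v) would close a cycle
        parent[ru] = rv
    return True


if __name__ == "__main__":
    tree1 = {(0, 1), (1, 2), (2, 3)}
    tree2 = {(0, 2), (1, 2), (2, 3)}
    print(find_exchange_partner(tree1, tree2, (0, 1), is_forest))
```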

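Submodular functions. A function $f : 2^N \to \mathbb{R}$ is submodular if for any $A, B \subseteq N$,
$f(A) + f(B) \ge f(A \cup B) + f(A \cap B)$. In addition, $f$ is monotone if $f(S) \le f(T)$ whenever $S \subseteq T$. We denote by
$f_A(i) = f(A + i) - f(A)$ the marginal value of $i$ with respect to $A$. An important concept in recent work on
submodular functions [5, 35, 6, 18, 20, 36] is the multilinear extension of a submodular function:

$F(x) = \mathbf{E}[f(\hat{x})] = \sum_{S \subseteq N} f(S) \prod_{i \in S} x_i \prod_{i \in N \setminus S} (1 - x_i),$

where $\hat{x}$ denotes a random set that contains each element $i$ independently with probability $x_i$.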
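For very small ground sets the sum defining the multilinear extension can be evaluated literally. The sketch below does exactly that; it takes time exponential in $|N|$ and is meant only to make the definition concrete (in practice $F$ is typically estimated by sampling). The function name is ours.

```python
from typing import Callable, Dict, FrozenSet, Hashable


def multilinear_extension(f: Callable[[FrozenSet[Hashable]], float],
                          x: Dict[Hashable, float]) -> float:
    """Evaluate F(x) = sum over S of f(S) * prod_{i in S} x_i * prod_{i not in S} (1 - x_i)."""
    elems = list(x)
    total = 0.0
    for mask in range(1 << len(elems)):
        # Decode the bitmask into the subset S.
        subset = frozenset(e for bit, e in enumerate(elems) if (mask >> bit) & 1)
        prob = 1.0
        for e in elems:
            prob *= x[e] if e in subset else 1.0 - x[e]
        total += f(subset) * prob
    return total


if __name__ == "__main__":
    # f(S) = min(|S|, 1) is monotone submodular; here F(x) = 1 - prod_i (1 - x_i).
    print(multilinear_extension(lambda s: min(len(s), 1), {"a": 0.5, "b": 0.5}))
```

For a linear function such as $f(S) = |S|$ the extension reduces to $\sum_i x_i$, which gives a quick consistency check.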

Rounding in the matroid polytope. A rounding procedure takes a point in the matroid polytope $x \in P(\mathcal{M})$
and rounds it to an independent set $R \in \mathcal{I}$. In its randomized version, it is oblivious to any objective function
and produces a random independent set, with a distribution depending only on the starting point $x \in P(\mathcal{M})$. If
the starting point is in the matroid base polytope $B(\mathcal{M})$, the rounded solution is a (random) base of $\mathcal{M}$.

One candidate for such a rounding procedure is pipage rounding [6, 36]. We give a complete description
of the pipage rounding technique in the appendix. In particular, this rounding satisfies that $\Pr[i \in R] = x_i$ for
each element $i$, and $\mathbf{E}[f(R)] \ge F(x)$ for any submodular function $f$ and its multilinear extension $F$. Our new
rounding, which is described in Section 3, satisfies the same properties and has additional advantages.

3 Randomized swap rounding 

Let $\mathcal{M} = (N, \mathcal{I})$ be a matroid of rank $d = r(N)$ and let $n = |N|$. Randomized swap rounding is a randomized
procedure that rounds a point $x \in P(\mathcal{M})$ to an independent set. We present the procedure for points in the base
polytope. It can easily be generalized to round any point in the matroid polytope (see Appendix B.2).

Assume that $x \in B(\mathcal{M})$ is the point we want to round. The procedure needs a representation of $x$ as a
convex combination of bases, i.e., $x = \sum_{\ell=1}^{n} \beta_\ell \mathbf{1}_{B_\ell}$ with $\sum_{\ell=1}^{n} \beta_\ell = 1$, $\beta_\ell \ge 0$.
Notice that by Carathéodory's theorem there exists such a convex representation using at most $n$ bases. In some
applications, the vector $x$ comes along with a convex representation. Otherwise, it is well-known that one can find
such a convex representation in polynomial time using the fact that one can separate (or equivalently optimize)
over the polytope in polynomial time (see for example ED). For matroid polytopes, Cunningham [8] proposed a
combinatorial algorithm that finds a convex representation of $x \in B(\mathcal{M})$ using at most $n$ bases and whose runtime is
bounded by $O(n^6)$ calls to an independence oracle. In special cases, faster algorithms are known; for example
any point in the spanning tree polytope of a graph $G = (V, E)$ can be decomposed into a convex combination
of spanning trees in $O(|V|^3 |E|)$ time [13]. In general this would be the dominating term in the running time of
randomized swap rounding. 

Given a convex combination of bases $x = \sum_{\ell=1}^{n} \beta_\ell \mathbf{1}_{B_\ell}$, the procedure takes $O(nd^2)$ calls to a matroid
independence oracle. The rounding proceeds in $n - 1$ stages, where in the first stage we merge the bases $B_1, B_2$
(randomly) into a new base $C_2$, and replace $\beta_1 \mathbf{1}_{B_1} + \beta_2 \mathbf{1}_{B_2}$ in the linear combination by
$(\beta_1 + \beta_2) \mathbf{1}_{C_2}$. In the $k$-th stage, $C_k$ and $B_{k+1}$ are merged into a new base $C_{k+1}$, and
$(\sum_{\ell=1}^{k} \beta_\ell) \mathbf{1}_{C_k} + \beta_{k+1} \mathbf{1}_{B_{k+1}}$ is replaced in the linear combination by
$(\sum_{\ell=1}^{k+1} \beta_\ell) \mathbf{1}_{C_{k+1}}$. After $n - 1$ stages, we obtain a linear combination
$(\sum_{\ell=1}^{n} \beta_\ell) \mathbf{1}_{C_n} = \mathbf{1}_{C_n}$, and the base $C_n$ is returned.



The procedure we use to merge two bases, called MergeBases, takes as input two bases $B_1$ and $B_2$ and two
positive scalars $\beta_1$ and $\beta_2$. It is described in the figure below. Notice that the procedure relies heavily on
the basis exchange property given by Theorem 2.1 to guarantee the existence of the elements $j$ in the while loop.
As discussed in Section 2, $j$ can be found by checking all elements in $B_2 \setminus B_1$. Furthermore, since the
cardinality of $B_1 \setminus B_2$ decreases at each iteration by one, the total number of iterations is bounded by
$|B_1| = d$.



Algorithm MergeBases($\beta_1, B_1, \beta_2, B_2$):
  While ($B_1 \ne B_2$) do
    Pick $i \in B_1 \setminus B_2$ and find $j \in B_2 \setminus B_1$ such that
      $B_1 - i + j \in \mathcal{I}$ and $B_2 - j + i \in \mathcal{I}$;
    With probability $\beta_1 / (\beta_1 + \beta_2)$, {$B_2 \leftarrow B_2 - j + i$};
    Else {$B_1 \leftarrow B_1 - i + j$};
  EndWhile
  Output $B_1$.



Algorithm SwapRound($x = \sum_{\ell=1}^{n} \beta_\ell \mathbf{1}_{B_\ell}$):
  $C_1 = B_1$;
  For ($k = 1$ to $n - 1$) do
    $C_{k+1} = $ MergeBases($\sum_{\ell=1}^{k} \beta_\ell, C_k, \beta_{k+1}, B_{k+1}$);
  EndFor
  Output $C_n$.



The main algorithm SwapRound is described in 
the figure. It uses MergeBases to repeatedly merge 
bases in the convex decomposition of x. For further 
analysis we present a different viewpoint on the al- 
gorithm, namely as a random process in the matroid 
base polytope. This also allows us to present the al- 
gorithm in a common framework with pipage round- 
ing and to draw parallels between the approaches more easily. 
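The two procedures translate almost line by line into code. The following is a minimal sketch, not a tuned implementation: the function names, the `is_independent` callback (standing in for the independence oracle) and the `rng` parameter are our own illustrative choices, and the exchange partner $j$ is found by the plain scan over $B_2 \setminus B_1$ described in Section 2.

```python
import random
from typing import Callable, Hashable, Optional, Sequence, Set

Oracle = Callable[[Set[Hashable]], bool]


def merge_bases(beta1: float, base1: Set[Hashable], beta2: float, base2: Set[Hashable],
                is_independent: Oracle, rng: random.Random) -> Set[Hashable]:
    """Randomly merge two bases with weights beta1, beta2 into a single base."""
    b1, b2 = set(base1), set(base2)
    while b1 != b2:
        i = next(iter(b1 - b2))
        # The base exchange property guarantees that a valid partner j exists.
        j = next((c for c in b2 - b1
                  if is_independent((b1 - {i}) | {c}) and is_independent((b2 - {c}) | {i})),
                 None)
        if j is None:
            raise ValueError("inputs are not two bases of the same matroid")
        if rng.random() < beta1 / (beta1 + beta2):
            b2 = (b2 - {j}) | {i}   # B2 <- B2 - j + i
        else:
            b1 = (b1 - {i}) | {j}   # B1 <- B1 - i + j
    return b1


def swap_round(bases: Sequence[Set[Hashable]], betas: Sequence[float],
               is_independent: Oracle, rng: Optional[random.Random] = None) -> frozenset:
    """Round x = sum_l betas[l] * 1_{bases[l]} to a random base."""
    rng = rng or random.Random()
    current, weight = set(bases[0]), betas[0]
    for base, beta in zip(bases[1:], betas[1:]):
        current = merge_bases(weight, current, beta, base, is_independent, rng)
        weight += beta  # accumulated coefficient of the merged base
    return frozenset(current)


if __name__ == "__main__":
    # Uniform matroid of rank 2 on {0, 1, 2, 3}; x = (0.75, 0.5, 0.5, 0.25).
    print(swap_round([{0, 1}, {2, 3}, {0, 2}], [0.5, 0.25, 0.25], lambda s: len(s) <= 2))
```

Running the procedure many times and counting how often each element appears gives an empirical check that the marginals match the coordinates of $x$.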

We call one iteration of the while loop in the MergeBases procedure, which is repeatedly called in SwapRound,
an elementary operation of the swap rounding algorithm. Hence, an elementary operation changes
two components in one of the bases used in the convex representation of the current point. For example, if
the first elementary operation transforms the base $B_1$ into $B_1'$, then this can be interpreted on the matroid base
polytope as transforming the point $x = \sum_{\ell=1}^{n} \beta_\ell \mathbf{1}_{B_\ell}$ into
$\beta_1 \mathbf{1}_{B_1'} + \sum_{\ell=2}^{n} \beta_\ell \mathbf{1}_{B_\ell}$. Hence, the SwapRound
algorithm can be seen as a sequence of $dn$ elementary operations leading to a random sequence
$\mathbf{X}_0, \ldots, \mathbf{X}_\tau$ where $\mathbf{X}_t$ denotes the convex combination after $t$ elementary operations.



4 Negative correlation for dependent rounding procedures 

In this section, we prove a result which shows that the statement of Theorem 1.1 is true for a large class of
random vector-valued processes that only change at most two components at a time. Theorem 1.1 then easily
follows by observing that randomized swap rounding as well as pipage rounding fall in this class of random
processes. The proof follows the same lines as [14] in the case of bipartite graphs. The intuitive reason for
negative correlation is that whenever a pair of variables is being modified, their sum remains constant. Hence,
knowing that one variable is high can only make the expectation of another variable lower.

Lemma 4.1. Let $\tau \in \mathbb{N}$ and let $\mathbf{X}_t = (X_{1,t}, \ldots, X_{n,t})$ for $t \in \{0, \ldots, \tau\}$ be a non-negative vector-valued
random process with initial distribution given by $X_{i,0} = x_i$ with probability 1 $\forall i \in [n]$, and satisfying the
following properties:

1. $\mathbf{E}[X_{i,t+1} \mid \mathbf{X}_t] = X_{i,t}$ for $t \in \{0, \ldots, \tau\}$ and $i \in [n]$.

2. $\mathbf{X}_t$ and $\mathbf{X}_{t+1}$ differ in at most two components for $t \in \{0, \ldots, \tau - 1\}$.

3. For $t \in \{0, \ldots, \tau\}$, if two components $i, j \in [n]$ change between $\mathbf{X}_t$ and $\mathbf{X}_{t+1}$, then their sum is
preserved: $X_{i,t+1} + X_{j,t+1} = X_{i,t} + X_{j,t}$.

Then for any $t \in \{0, \ldots, \tau\}$, the components of $\mathbf{X}_t$ satisfy
$\mathbf{E}[\prod_{i \in S} X_{i,t}] \le \prod_{i \in S} x_i$ $\forall S \subseteq [n]$.

Proof. We are interested in the quantity $Y_t = \prod_{i \in S} X_{i,t}$. At the beginning of the process, we have
$\mathbf{E}[Y_0] = \prod_{i \in S} x_i$. The main claim is that for each $t$, we have $\mathbf{E}[Y_{t+1} \mid \mathbf{X}_t] \le Y_t$.

Let us condition on a particular configuration of variables at time $t$, $\mathbf{X}_t = (X_{1,t}, \ldots, X_{n,t})$. We consider
three cases:

• If no variable $X_i$, $i \in S$, is modified in step $t$, we have
$Y_{t+1} = \prod_{i \in S} X_{i,t+1} = \prod_{i \in S} X_{i,t} = Y_t$.

• If exactly one variable $X_i$, $i \in S$, is modified in step $t$, then by property 1 of the lemma:

$\mathbf{E}[Y_{t+1} \mid \mathbf{X}_t] = \mathbf{E}[X_{i,t+1} \mid \mathbf{X}_t] \cdot \prod_{j \in S \setminus \{i\}} X_{j,t} = \prod_{j \in S} X_{j,t} = Y_t.$

• If two variables $X_i, X_j$, $i, j \in S$, are modified in step $t$, we use the property that their sum is preserved:
$X_{i,t+1} + X_{j,t+1} = X_{i,t} + X_{j,t}$. This also implies that

$\mathbf{E}[(X_{i,t+1} + X_{j,t+1})^2 \mid \mathbf{X}_t] = (X_{i,t} + X_{j,t})^2. \quad (1)$

On the other hand, the value of each variable is preserved in expectation. Applying this to their difference,
we get $\mathbf{E}[X_{i,t+1} - X_{j,t+1} \mid \mathbf{X}_t] = X_{i,t} - X_{j,t}$. Since $\mathbf{E}[Z^2] \ge (\mathbf{E}[Z])^2$ holds for any random
variable, we get

$\mathbf{E}[(X_{i,t+1} - X_{j,t+1})^2 \mid \mathbf{X}_t] \ge (X_{i,t} - X_{j,t})^2. \quad (2)$

Combining (1) and (2), and using the formula $XY = \frac{1}{4}((X + Y)^2 - (X - Y)^2)$, we get

$\mathbf{E}[X_{i,t+1} X_{j,t+1} \mid \mathbf{X}_t] \le X_{i,t} X_{j,t}.$

Therefore,

$\mathbf{E}[Y_{t+1} \mid \mathbf{X}_t] = \mathbf{E}[X_{i,t+1} X_{j,t+1} \mid \mathbf{X}_t] \cdot \prod_{k \in S \setminus \{i,j\}} X_{k,t} \le \prod_{k \in S} X_{k,t} = Y_t,$

as claimed. By taking expectation over all configurations $\mathbf{X}_t$ we obtain $\mathbf{E}[Y_{t+1}] \le \mathbf{E}[Y_t]$. Consequently,
$\mathbf{E}[\prod_{i \in S} X_{i,t}] = \mathbf{E}[Y_t] \le \mathbf{E}[Y_{t-1}] \le \ldots \le \mathbf{E}[Y_0] = \prod_{i \in S} x_i$, as claimed by the lemma. □
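As a sanity check of the two-variable case, consider a hypothetical single step (our own toy example, not taken from the analysis above) in which $X_{i,t} = X_{j,t} = 1/2$ and the pair moves to $(1, 0)$ or $(0, 1)$ with probability $1/2$ each. Both marginals and the sum are preserved, and the expected product can only drop:

```latex
\begin{align*}
\mathbf{E}[X_{i,t+1} \mid \mathbf{X}_t] &= \tfrac12 \cdot 1 + \tfrac12 \cdot 0 = \tfrac12 = X_{i,t}, \\
X_{i,t+1} + X_{j,t+1} &= 1 = X_{i,t} + X_{j,t}, \\
\mathbf{E}[X_{i,t+1} X_{j,t+1} \mid \mathbf{X}_t] &= \tfrac12 (1 \cdot 0) + \tfrac12 (0 \cdot 1) = 0 \le \tfrac14 = X_{i,t} X_{j,t}.
\end{align*}
```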

Any process that satisfies the conditions of Lemma 4.1 thus also satisfies the first statement of Theorem 1.1.
Furthermore, the second statement of Theorem 1.1 also follows by observing that for any process
$(X_{1,t}, \ldots, X_{n,t})$ that satisfies the conditions of Lemma 4.1, the process $(1 - X_{1,t}, \ldots, 1 - X_{n,t})$
also satisfies the conditions. As we mentioned in Section 1, these results imply strong concentration bounds for linear
functions of the variables $X_1, \ldots, X_n$ (Corollary 1.2).

Both randomized swap rounding and pipage rounding satisfy the conditions of Lemma 4.1 (proofs can be
found in the Appendix). This implies Theorem 1.1. Note that the sequences $X_t$ created by randomized swap
rounding or pipage rounding, besides satisfying the conditions of Lemma 4.1, are Markovian, and hence they
are vector-valued martingales.
5 Submodular maximization subject to 1 matroid and k linear constraints



In this section, we present an algorithm for the problem of maximizing a monotone submodular function subject 
to 1 matroid and k linear ("knapsack") constraints. 

Problem definition. Given a monotone submodular function $f : 2^N \to \mathbb{R}_+$ (by a value oracle), and a matroid
$\mathcal{M} = (N, \mathcal{I})$ (by an independence oracle). For each $i \in N$, we have $k$ parameters $c_{ij}$, $1 \le j \le k$. A set $S \subseteq N$
is feasible if $S \in \mathcal{I}$ and $\sum_{i \in S} c_{ij} \le 1$ for each $1 \le j \le k$. The goal is to maximize $f$ over all feasible sets.

Kulik et al. gave a $(1 - 1/e - \epsilon)$-approximation for the same problem with a constant number of linear
constraints, but without the matroid constraint [18]. Gupta, Nagarajan and Ravi [15] show that a knapsack
constraint can in a technical sense be simulated in a black-box fashion by a collection of partition matroid
constraints. Using their reduction and known results on submodular set function maximization subject to matroid
constraints [12, 21], they obtain a $1/(p + q + 1)$-approximation with $p$ knapsacks and $q$ matroids for any $q \ge 1$
and fixed $p \ge 1$ (or $1/(p + q + \epsilon)$ for any fixed $p \ge 1$, $q \ge 2$ and $\epsilon > 0$).

5.1 Constant number of knapsack constraints 

We consider first 1 matroid and a constant number k of linear constraints, in which case each linear constraint 
is thought of as a "knapsack" constraint. We show a $(1 - 1/e - \epsilon)$-approximation in this case, building upon
the algorithm of Kulik, Shachnai and Tamir [18], which works for k knapsack constraints (without a matroid
constraint). The basic idea is that we can add the knapsack constraints to the multilinear optimization problem

\[ \max\{F(x) : x \in P(\mathcal{M})\} \]

which is used to achieve a $(1 - 1/e)$-approximation for 1 matroid constraint [6]. Using standard techniques
(partial enumeration), we get rid of all items of large value or size, and then scale down the constraints a little bit, 
so that we have some room for overflow in the rounding stage. We can still solve the multilinear optimization 
problem within a factor of 1 — 1/e and then round the fractional solution using randomized swap rounding 
(or pipage rounding). Using the fact that randomized swap rounding makes the size in each knapsack strongly 
concentrated, we conclude that our solution is feasible with constant probability. 

Algorithm. 

• Assume $0 < \epsilon < 1/(4k^2)$. Enumerate all sets $A$ of at most $1/\epsilon^4$ items which form a feasible solution.
(We are trying to guess the most valuable items in the optimal solution under a greedy ordering.) For
each candidate set $A$, repeat the following.

• Let $\mathcal{M}' = \mathcal{M}/A$ be the matroid where $A$ has been contracted. For each $1 \le j \le k$, let $C_j = 1 - \sum_{i \in A} c_{ij}$
be the remaining capacity in knapsack $j$. Let $B$ be the set of items $i \notin A$ such that either $f_A(i) > \epsilon^4 f(A)$
or $c_{ij} > k\epsilon^3 C_j$ for some $j$ (the item is relatively big compared to the size of some knapsack). Throw
away all the items in $B$.

• We consider a reduced problem on the item set $N \setminus (A \cup B)$, with the matroid constraint $\mathcal{M}'$, knapsack
capacities $C_j$, and objective function $g(S) = f_A(S)$. Define a polytope

\[ P' = \{x \in P(\mathcal{M}') : \forall j;\ \sum_i c_{ij} x_i \le C_j\} \tag{3} \]

where $P(\mathcal{M}')$ is the matroid polytope of $\mathcal{M}'$. We solve (approximately) the following optimization
problem:

\[ \max\{G(x) : x \in (1 - \epsilon)P'\} \tag{4} \]

where $G(x) = E[g(\hat{x})]$ is the multilinear extension of $g(S)$. Since linear functions can be optimized over
$P'$ in polynomial time, we can use the continuous greedy algorithm [35] to find a fractional solution $x^*$
within a factor of $1 - 1/e$ of optimal.

• Given a fractional solution $x^*$, we apply randomized pipage rounding to $x^*$ with respect to the matroid
polytope $P(\mathcal{M}')$. Call the resulting set $R_A$. Among all candidate sets $A$ such that $A \cup R_A$ is feasible,
return the one maximizing $f(A \cup R_A)$.
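The preprocessing in the second step (residual capacities and removal of big items) can be sketched as follows. This is a minimal illustration under assumptions, not the paper's implementation: `prune_big_items`, the value oracle `f` passed as a Python callable, and the `sizes[i][j]` layout for $c_{ij}$ are hypothetical names chosen for the example, and the toy instance uses a modular $f$ only to keep the arithmetic easy to follow.

```python
from fractions import Fraction

def prune_big_items(f, sizes, ground, A, k, eps):
    """Return (B, capacities) for a guessed set A.

    B collects items outside A whose marginal value exceeds eps^4 * f(A)
    or whose size in some knapsack j exceeds k * eps^3 * C_j.
    """
    A = frozenset(A)
    # C_j = 1 - sum of sizes of A in knapsack j (capacities are normalized to 1).
    capacities = [1 - sum(sizes[i][j] for i in A) for j in range(k)]
    base = f(A)
    big = set()
    for i in ground:
        if i in A:
            continue
        too_valuable = f(A | {i}) - base > eps**4 * base
        too_large = any(sizes[i][j] > k * eps**3 * capacities[j] for j in range(k))
        if too_valuable or too_large:
            big.add(i)
    return big, capacities

if __name__ == "__main__":
    weights = {0: 625, 1: 1, 2: 2, 3: 1}
    f = lambda S: sum(weights[i] for i in S)          # toy modular objective
    sizes = {0: [Fraction(1, 2)], 1: [Fraction(1, 500)],
             2: [Fraction(1, 500)], 3: [Fraction(1, 100)]}
    print(prune_big_items(f, sizes, [0, 1, 2, 3], {0}, 1, Fraction(1, 5)))
```

In the toy instance, item 2 is discarded for its value and item 3 for its size, while item 1 survives into the reduced problem.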

We remark that the value of this algorithm (unlike the (1 — 1/e) -approximation for 1 matroid constraint) is 
purely theoretical, as it relies on enumeration of a huge (constant) number of elements. 

Theorem 5.1. With constant positive probability, the algorithm above returns a solution of value at least
$(1 - 1/e - 3\epsilon)OPT$.

Proof. Consider an optimum solution $O$, i.e. $OPT = f(O)$. Order the elements of $O$ greedily by decreasing
marginal values, and let $A \subseteq O$ be the elements whose marginal value is at least $\epsilon^4 OPT$. There can be at most
$1/\epsilon^4$ such elements, and so the algorithm will consider them as one of the candidate sets. We assume in the
following that this is the set $A$ chosen by the algorithm.

We consider the reduced instance, where $\mathcal{M}' = \mathcal{M}/A$ and the knapsack capacities are $C_j = 1 - \sum_{i \in A} c_{ij}$.
$O \setminus A$ is a feasible solution for this instance and we have $g(O \setminus A) = f_A(O \setminus A) = OPT - f(A)$. We
know that in $O \setminus A$, there are no items of marginal value more than the last item in $A$. In particular, $f_A(i) \le \epsilon^4 f(A) \le \epsilon^4 OPT$
for all $i \in O \setminus A$. We throw away all items where $f_A(i) > \epsilon^4 f(A)$ but this does not
affect any item in $O \setminus A$. We also throw away the set $B \subseteq N \setminus A$ of items whose size in some knapsack is
more than $k\epsilon^3 C_j$. In $O \setminus A$, there can be at most $1/(k\epsilon^3)$ such items for each knapsack, i.e. $1/\epsilon^3$ items in
total. Since their marginal values with respect to $A$ are bounded by $\epsilon^4 OPT$, these items together have value
$g(O \cap B) = f_A(O \cap B) \le \epsilon OPT$. $O' = O \setminus (A \cup B)$ is still a feasible set for the reduced problem, and using
submodularity, its value is

\[ g(O') = g((O \setminus A) \setminus (O \cap B)) \ge g(O \setminus A) - g(O \cap B) \ge OPT - f(A) - \epsilon OPT. \]

Now consider the multilinear problem (4). Note that the indicator vector $1_{O'}$ is feasible in $P'$, and hence
$(1 - \epsilon)1_{O'}$ is feasible in $(1 - \epsilon)P'$. Using the concavity of $G(x)$ along the line from the origin to $1_{O'}$, we have
$G((1 - \epsilon)1_{O'}) \ge (1 - \epsilon)g(O') \ge (1 - 2\epsilon)OPT - f(A)$. Using the continuous greedy algorithm [35], we find
a fractional solution $x^*$ of value

\[ G(x^*) \ge (1 - 1/e)\,G((1 - \epsilon)1_{O'}) \ge (1 - 1/e - 2\epsilon)OPT - f(A). \]

Finally, we apply randomized swap rounding (or pipage rounding) to $x^*$ and call the resulting set $R$. By
the construction of randomized swap rounding, $R$ is independent in $\mathcal{M}'$ with probability 1. However, $R$ might
violate some of the knapsack constraints.

Consider a fixed knapsack constraint, $\sum_{i \in S} c_{ij} \le C_j$. Our fractional solution $x^*$ satisfies $\sum_i c_{ij} x^*_i \le (1 - \epsilon)C_j$.
Also, we know that all sizes in the reduced instance are bounded by $c_{ij} \le k\epsilon^3 C_j$. By scaling,
$c'_{ij} = c_{ij}/(k\epsilon^3 C_j)$, we can apply Corollary 1.2 with $\mu = (1 - \epsilon)/(k\epsilon^3)$:

\[ \Pr[\sum_{i \in R} c_{ij} > C_j] \le \Pr[\sum_{i \in R} c'_{ij} > (1 + \epsilon)\mu] \le e^{-\mu\epsilon^2/3} < e^{-1/(4k\epsilon)}. \]

On the other hand, consider the objective function $g(R)$. In the reduced instance, all items have value $g(i) \le \epsilon^4 OPT$.
Let $\mu = G(x^*)/(\epsilon^4 OPT)$. Then, Theorem 1.4 implies

\[ \Pr[g(R) \le (1 - \delta)G(x^*)] = \Pr[g(R)/(\epsilon^4 OPT) \le (1 - \delta)\mu] \le e^{-\delta^2\mu/8} = e^{-\delta^2 G(x^*)/(8\epsilon^4 OPT)}. \]

We set $\delta = \epsilon\, OPT / G(x^*)$ and obtain

\[ \Pr[g(R) \le G(x^*) - \epsilon OPT] \le e^{-OPT/(8\epsilon^2 G(x^*))} \le e^{-1/(8\epsilon^2)}. \]

By the union bound,

\[ \Pr[g(R) \le G(x^*) - \epsilon OPT \ \text{or}\ \exists j;\ \sum_{i \in R} c_{ij} > C_j] \le e^{-1/(8\epsilon^2)} + k e^{-1/(4k\epsilon)}. \]

For $\epsilon < 1/(4k^2)$, this probability is at most $e^{-2k^4} + k e^{-k} < 1$. If this event does not occur, we have a feasible
solution of value $f(A \cup R) = f(A) + g(R) \ge f(A) + G(x^*) - \epsilon OPT \ge (1 - 1/e - 3\epsilon)OPT$.

□
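As a numerical sanity check of the final union bound, the short sketch below evaluates $e^{-1/(8\epsilon^2)} + k e^{-1/(4k\epsilon)}$ and compares it with the coarser bound $e^{-2k^4} + k e^{-k}$ for a few values of $k$. The function names are hypothetical and the script only illustrates the arithmetic of the proof; it is not part of the algorithm.

```python
import math

def failure_probability_bound(k, eps):
    """Union bound from the proof: value deviation term plus k knapsack terms."""
    value_term = math.exp(-1.0 / (8 * eps**2))
    knapsack_term = k * math.exp(-1.0 / (4 * k * eps))
    return value_term + knapsack_term

def coarse_bound(k):
    """The bound e^{-2k^4} + k e^{-k}, valid whenever eps < 1/(4k^2)."""
    return math.exp(-2 * k**4) + k * math.exp(-k)

if __name__ == "__main__":
    for k in range(1, 6):
        eps = 0.99 / (4 * k**2)          # any eps strictly below 1/(4k^2)
        print(k, failure_probability_bound(k, eps), coarse_bound(k))
```

For every $k$ tried, the first column stays below the second, and the second stays below 1, which is all the proof needs.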

5.2 Loose packing constraints 

In this section we consider the case when the number of linear packing constraints is not a fixed constant. The 
notation we use in this case is that of a packing integer program: 

\[ \max\{f(x) : x \in P(\mathcal{M}),\ Ax \le b,\ x \in \{0,1\}^n\}. \]

Here $f : 2^N \to \mathbb{R}_+$ is a monotone submodular function with $n = |N|$, $\mathcal{M} = (N, \mathcal{I})$ is a matroid, $A \in \mathbb{R}_+^{k \times n}$ is
a non-negative matrix and $b \in \mathbb{R}_+^k$ is a non-negative vector. This problem has been studied extensively when
$f(x)$ is a linear function, in other words $f(x) = w^T x$ for some non-negative weight vector $w \in \mathbb{R}_+^n$. Even this
case with $A, b$ having only 0, 1 entries captures the maximum independent set problem in graphs and hence is
NP-hard to approximate to within an $n^{1-\epsilon}$-factor for any fixed $\epsilon > 0$. For this reason a variety of restrictions
on $A, b$ have been studied.

We consider the case when the constraints are sufficiently loose, i.e. the right-hand side $b$ is significantly
larger than entries in $A$: in particular, we assume $b_i \ge c \log k \cdot \max_j A_{ij}$ for $1 \le i \le k$. In this case, we propose
a straightforward algorithm which works as follows.

Algorithm. 

• Let $\epsilon = \sqrt{6/c}$. Solve (approximately) the following optimization problem:

\[ \max\{F(x) : x \in (1 - \epsilon)P\} \]

where $F(x) = E[f(\hat{x})]$ is the multilinear extension of $f(S)$, and

\[ P = \{x \in P(\mathcal{M}) \mid \forall i;\ \sum_{j \in N} A_{ij} x_j \le b_i\}. \]

Since linear functions can be optimized over $P$ in polynomial time, we can use the continuous greedy
algorithm [35] to find a fractional solution $x^*$ within a factor of $1 - 1/e$ of optimal.

• Apply randomized pipage rounding to $x^*$ with respect to the matroid polytope $P(\mathcal{M})$. If the resulting
solution $R$ satisfies the packing constraints, return $R$; otherwise, fail.

Theorem 5.2. Assume that $A \in \mathbb{R}_+^{k \times n}$ and $b \in \mathbb{R}_+^k$ such that $b_i \ge c A_{ij} \log k$ for all $i, j$ and some constant
$c = 6/\epsilon^2$. Then the algorithm above gives a $(1 - 1/e - O(\epsilon))$-approximation with constant probability.
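The looseness condition and the resulting per-constraint tail bound are easy to evaluate for a concrete instance. The following sketch is an illustration under assumptions: `is_loose` and `violation_bound` are hypothetical helper names, the matrix is given as a list of rows, and the tail bound is the Chernoff-type estimate $e^{-\delta^2\mu/3}$ with $\delta = \epsilon$ and $\mu = c \log k$.

```python
import math

def looseness_constant(eps):
    """c = 6 / eps^2, the constant tying eps to the looseness of the constraints."""
    return 6.0 / eps**2

def is_loose(A, b, eps):
    """Check b_i >= c * log(k) * max_j A_ij for every row i of the k x n matrix A."""
    k = len(A)
    c = looseness_constant(eps)
    return all(b[i] >= c * math.log(k) * max(A[i]) for i in range(k))

def violation_bound(eps, k):
    """exp(-delta^2 * mu / 3) with delta = eps and mu = c * log k; equals 1 / k^2."""
    mu = looseness_constant(eps) * math.log(k)
    return math.exp(-eps**2 * mu / 3)

if __name__ == "__main__":
    A = [[1.0, 0.5], [0.2, 1.0]]
    print(is_loose(A, [17.0, 17.0], 0.5), violation_bound(0.1, 10))
```

With $c = 6/\epsilon^2$ the exponent simplifies to $-2 \log k$, so each constraint is violated with probability at most $1/k^2$ regardless of $\epsilon$.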

We remark that it is NP-hard to achieve a better than (1 — l/e)-approximation even when k = 1 and the 
constraint is very loose (Aij = 1 and bi — > oo) ATI . 
Proof. The proof is similar to that of Theorem 5.1 but simpler. We only highlight the main differences.

In the first stage we obtain a fractional solution such that $F(x^*) \ge (1 - \epsilon)(1 - 1/e)OPT$. Randomized swap
rounding yields a random solution $R$ which satisfies the matroid constraint. It remains to check the packing
constraints. For each $i$, we have

\[ E[\sum_{j \in R} A_{ij}] = \sum_{j \in N} A_{ij} x^*_j \le (1 - \epsilon) b_i. \]

The variables $X_j$ are negatively correlated and by Corollary 1.2 with $\delta = \epsilon = \sqrt{6/c}$ and $\mu = c \log k$,

\[ \Pr[\sum_{j \in R} A_{ij} > b_i] < e^{-\delta^2\mu/3} = \frac{1}{k^2}. \]

By the union bound, all packing constraints are satisfied with probability at least $1 - 1/k$. We assume here that
$k = \omega(1)$. By using Theorem 1.4, we can also conclude that the value of the solution is at least
$(1 - 1/e - O(\epsilon))OPT$ with constant probability. □

6 Minimax integer programs with a matroid constraint 

Minimax integer programs are motivated by applications to routing and partitioning. The setup is as follows; we
follow [33]. We have boolean variables $x_{i,j}$ for $i \in [p]$ and $j \in [\ell_i]$ for integers $\ell_1, \ldots, \ell_p$. Let $n = \sum_{i \in [p]} \ell_i$.
The goal is to minimize $\lambda$ subject to:

• equality constraints: $\forall i \in [p]$, $\sum_{j \in [\ell_i]} x_{i,j} = 1$

• a system of linear inequalities $Ax \le \lambda \mathbf{1}$ where $A \in [0,1]^{m \times n}$

• integrality constraints: $x_{i,j} \in \{0, 1\}$ for all $i, j$.

The variables $x_{i,j}$, $j \in [\ell_i]$ for each $i \in [p]$ capture the fact that exactly one option amongst the $\ell_i$ options
in group $i$ should be chosen. A canonical example is the congestion minimization problem for integral routings
in graphs where for each $i$, the $x_{i,j}$ variables represent the different paths for routing the flow of a pair $(s_i, t_i)$
and the matrix $A$ encodes the capacity constraints of the edges. A natural approach is to solve the natural
LP relaxation for the above problem and then apply randomized rounding by choosing independently for each
$i$ exactly one $j \in [\ell_i]$ where the probability of choosing $j \in [\ell_i]$ is exactly equal to $x_{i,j}$. This follows the
randomized rounding method of Raghavan and Thompson for congestion minimization [29] and one obtains an
$O(\log m / \log\log m)$-approximation with respect to the fractional solution. Using the Lovász Local Lemma (and
complicated derandomization) it is possible to obtain an improved bound of $O(\log q / \log\log q)$ [23, 33] where
$q$ is the maximum number of non-zero entries in any column of $A$. This refined bound has various applications.
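The independent rounding step described above (one option per group, chosen with probability equal to its fractional value) can be sketched in a few lines. This is an illustrative sketch under assumptions: the fractional solution is stored as one list per group, `offsets[i]` is a hypothetical index giving the first column of group $i$ in $A$, and the LP solve itself is not shown.

```python
import random

def round_groups(x, rng):
    """Pick one option per group; x[i][j] is the fractional value of option j in group i."""
    return [rng.choices(range(len(row)), weights=row, k=1)[0] for row in x]

def congestion(A, offsets, choice):
    """Maximum row load of A over the chosen columns (the value of lambda achieved)."""
    columns = [offsets[i] + j for i, j in enumerate(choice)]
    return max(sum(row[c] for c in columns) for row in A)

if __name__ == "__main__":
    x = [[0.5, 0.5], [0.25, 0.75]]            # two groups with two options each
    A = [[1, 0, 0, 1], [0, 1, 1, 0]]          # m = 2 packing constraints
    choice = round_groups(x, random.Random(0))
    print(choice, congestion(A, [0, 2], choice))
```

Each group is rounded independently here, which is exactly what stops working once a group must select $k_i > 1$ options and motivates the dependent rounding discussed next.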

Interestingly, the above problem becomes non-trivial if we make a slight change to the equality constraints.
Suppose for each $i \in [p]$ we now have an equality constraint of the form $\sum_{j \in [\ell_i]} x_{i,j} = k_i$ where $k_i$ is an
integer. For routing, this corresponds to a requirement of $k_i$ paths for pair $(s_i, t_i)$. Now the standard randomized
rounding doesn't quite work for this low congestion multi-path routing problem. Srinivasan [34], motivated by
this generalized routing problem, developed dependent randomized rounding and used the negative correlation
properties of this rounding to obtain an $O(\log m / \log\log m)$-approximation. This was further generalized in
[14] as randomized versions of pipage rounding in the context of other applications.

6.1 Congestion minimization under a matroid base constraint 

Here we show that our dependent rounding in matroids allows a clean generalization of the type of constraints
considered in several applications in [34, 14]. Let $\mathcal{M}$ be a matroid on a ground set $N$. Let $B(\mathcal{M})$ be the base
polytope of $\mathcal{M}$. We consider the problem

\[ \min\{\lambda : \exists x \in \{0,1\}^N,\ x \in B(\mathcal{M}),\ Ax \le \lambda \mathbf{1}\} \]






where $A \in [0,1]^{m \times N}$. We observe that the previous problem with the variables partitioned into groups and
equality constraints can be cast naturally as a special case of this matroid constraint problem; the equality
constraints simply correspond to a partition matroid on the ground set of all variables $x_{i,j}$.

However, our framework is much more flexible. For example, consider the spanning tree problem with
packing constraints: each edge has a weight $w_e$ and we want to minimize the maximum load on any vertex,
$\max_{v \in V} \sum_{e \in \delta(v)} w_e$. This problem also falls within our framework.

Theorem 6.1. There is an $O(\log m / \log\log m)$-approximation for the problem

\[ \min\{\lambda : \exists x \in \{0,1\}^N,\ x \in B(\mathcal{M}),\ Ax \le \lambda \mathbf{1}\}, \]

where $m$ is the number of packing constraints, i.e. $A \in [0,1]^{m \times N}$.

Proof. Fix a value of $\lambda$. Let $Z(\lambda) = \{j \mid \exists i;\ A_{ij} > \lambda\}$. We can force $x_j = 0$ for all $j \in Z(\lambda)$, because no
element $j \in Z(\lambda)$ can be in a feasible solution for $\lambda$. In polynomial time, we can check the feasibility of the
following LP:

\[ P_\lambda = \{x \in B(\mathcal{M}) : Ax \le \lambda \mathbf{1},\ x|_{Z(\lambda)} = 0\} \]

(because we can separate over $B(\mathcal{M})$ and the additional packing constraints efficiently). By binary search, we
can find (within $1 + \epsilon$) the minimum value of $\lambda$ such that $P_\lambda \ne \emptyset$. This is a lower bound on the actual optimum
$\lambda_{OPT}$. We also obtain the corresponding fractional solution $x^*$.

We apply randomized swap rounding (or randomized pipage rounding) to $x^*$, obtaining a random set $R$. $R$
satisfies the matroid base constraint by definition. Consider a fixed packing constraint (the $i$-th row of $A$). We
have

\[ \sum_{j \in N} A_{ij} x^*_j \le \lambda \]

and all entries $A_{ij}$ such that $x^*_j > 0$ are bounded by $\lambda$. We set $\tilde{A}_{ij} = A_{ij}/\lambda$, so that we can use Corollary 1.2.
We get

\[ \Pr[\sum_{j \in R} A_{ij} > (1 + \delta)\lambda] = \Pr[\sum_{j \in R} \tilde{A}_{ij} > 1 + \delta] \le \left(\frac{e^\delta}{(1 + \delta)^{1+\delta}}\right)^\mu. \]

For $\mu = 1$ and $1 + \delta = \frac{4 \log m}{\log\log m}$, this probability is bounded by

\[ \Pr[\sum_{j \in R} A_{ij} > (1 + \delta)\lambda] \le \left(\frac{e}{1 + \delta}\right)^{1+\delta} = \left(\frac{e \log\log m}{4 \log m}\right)^{\frac{4 \log m}{\log\log m}} < \left(\frac{1}{\sqrt{\log m}}\right)^{\frac{4 \log m}{\log\log m}} = \frac{1}{m^2} \]

for sufficiently large $m$. Therefore, all $m$ constraints are satisfied within a factor of $1 + \delta = \frac{4 \log m}{\log\log m}$ with high
probability. □
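The choice $1 + \delta = 4 \log m / \log\log m$ can be checked numerically against the Chernoff-type bound used in the proof. The sketch below is an illustration only; the function names are hypothetical, and the bound is evaluated in log-space to avoid overflow.

```python
import math

def chernoff_upper_tail(mu, delta):
    """(e^delta / (1 + delta)^(1 + delta))^mu, computed via its logarithm."""
    return math.exp(mu * (delta - (1 + delta) * math.log1p(delta)))

def overflow_factor(m):
    """The factor 1 + delta = 4 log m / log log m used in the proof (needs m > e)."""
    return 4 * math.log(m) / math.log(math.log(m))

if __name__ == "__main__":
    for m in (16, 1000, 10**6):
        delta = overflow_factor(m) - 1
        print(m, chernoff_upper_tail(1.0, delta), 1.0 / m**2)
```

For each $m$ tried, the tail probability falls below $1/m^2$, so a union bound over the $m$ constraints leaves failure probability at most $1/m$.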

We remark that the approximation guarantee can be made an "almost additive" $O(\log m)$, in the following
sense: Assuming that the optimum value is $\lambda^*$, for any fixed $\epsilon > 0$ we can find a solution of value
$\lambda \le (1 + \epsilon)\lambda^* + O(\frac{1}{\epsilon} \log m)$. Scaling is important here: recall that we assumed $A \in [0,1]^{m \times N}$. We omit the proof,
which follows by a similar application of the Chernoff bound as above, with $\mu = \lambda^*$ and $\delta = \epsilon + O(\frac{1}{\epsilon\lambda^*} \log m)$.

Minimum Stabbing and Crossing Tree Problems: Another interesting application of Theorem 6.1 is to the
minimum stabbing and crossing tree problems. Bilo et al. [4], motivated by several applications, considered the
crossing spanning tree problem. The input is a graph $G = (V, E)$ and an explicit set $\mathcal{C}$ of $m$ cuts in $G$. The goal
is to find a spanning tree that minimizes the maximum number of edges crossing any cut in $\mathcal{C}$. The algorithm in [4] returns
a tree that crosses any cut in $\mathcal{C}$ at most $O((\log m + \log n)(\gamma^* + \log n))$ times where $\gamma^*$ is the optimal solution
value; the authors claim an improved bound of $O(\gamma^* \log n + \log m)$ in a subsequent version of the paper.

The minimum stabbing tree problem arises in computational geometry: the input is a set $V = \{v_1, \ldots, v_n\}$
of points in $\mathbb{R}^d$; it is assumed that $d$ is a constant and the case of 2 dimensions is of particular interest. The task
is to construct a spanning tree on $V$ by connecting vertices with straight lines such that the crossing number,
which is the maximum number of edges that are intersected by any hyperplane, is minimized. This problem
was shown to be NP-hard by Fekete et al. [10]. It is relatively easy to see that the stabbing tree problem
is a special case of the crossing spanning tree problem; the number of combinatorially distinct cuts induced
by the hyperplanes is $O(n^d)$, one for each set of $d$ points that define a hyperplane through them. Thus, the
result in [4] implies that there is an algorithm for the stabbing tree problem that returns a tree with crossing
number $O(\lambda^* \log n)$ where $\lambda^*$ is the smallest crossing number of any spanning tree (note that this is via the improved
bound claimed by the authors of [4] in a longer version). Unaware of the work in [4], Har-Peled very recently
[16] gave a polynomial time algorithm for the stabbing tree problem that outputs a tree with crossing number
$O(\lambda^* \log n + \log^2 n / \log\log n)$.

Both of the above problems can be cast as special cases of the minimization problem presented in
Theorem 6.1 where $\mathcal{M}$ is the graphic matroid and each row of $A$ corresponds to the incidence vector of a cut.
Theorem 6.1 implies that using dependent randomized rounding, an $O(\log n / \log\log n)$-approximation can be
obtained for the stabbing tree problem and an $O(\log m / \log\log m)$-approximation for the crossing spanning
tree problem. The approximation guarantee can be transformed into an almost additive one as well, leading
to a solution of value $\lambda \le (1 + \epsilon)\lambda^* + O(\frac{1}{\epsilon} \log n)$ for the stabbing tree problem and a solution of value
$\gamma \le (1 + \epsilon)\gamma^* + O(\frac{1}{\epsilon} \log m)$ for the crossing spanning tree problem. Note that these additive results imply a
constant factor approximation if the optimal value is $\Omega(\log n)$ and $\Omega(\log m)$ respectively.

We remark that the results we obtain for the above problems can also be obtained by the maximum entropy 
sampling approach for spanning trees from [2] ; our algorithms have the advantage of being simpler and more 
efficient. 

6.2 Min-cost matroid bases with packing constraints 

We can similarly handle the case where in addition we want to minimize a linear objective function. An example 
of such a problem would be a multi-path routing problem minimizing the total cost in addition to congestion. 
Another example is the minimum-cost spanning tree with packing constraints for the edges incident with each 
vertex. We remark that in case the packing constraints are simply degree bounds, strong results are known 
- namely, there is an algorithm that finds a spanning tree of optimal cost and violating the degree bounds by 
at most one l; 32l . In the general case of finding a matroid base satisfying certain "degree constraints", there 
is an algorithm [17] that finds a base of optimal cost and violating the degree constraints by an additive error
of at most $\Delta - 1$, where each element participates in at most $\Delta$ constraints (e.g. $\Delta = 2$ for degree-bounded
spanning trees). The algorithm of [17] also works for upper and lower bounds, violating each constraint by at
most $2\Delta - 1$. See [17] for more details.

We consider a variant of this problem where the packing constraints can involve arbitrary weights and 
capacities. We show that we can find a matroid base of near-optimal cost which violates the packing constraints 
by a multiplicative factor of O (log m/ log log m), where m is the total number of packing constraints. 

Theorem 6.2. There is a $(1 + \epsilon, O(\log m / \log\log m))$-bicriteria approximation for the problem

\[ \min\{c^T x : x \in \{0,1\}^N,\ x \in B(\mathcal{M}),\ Ax \le b\}, \]

where $A \in [0,1]^{m \times N}$ and $b \in \mathbb{R}^m$; the first guarantee is w.r.t. the cost of the solution and the second guarantee
w.r.t. the overflow on the packing constraints.

Proof. We give a sketch of the proof. First, we throw away all elements that on their own violate some packing
constraint. Then, we solve the following LP:

\[ \min\{c^T x : x \in B(\mathcal{M}),\ Ax \le b\}. \]

Let the optimum solution be $x^*$. We apply randomized swap rounding (or randomized pipage rounding) to $x^*$,
yielding a random solution $R$. Since each of the $m$ constraints is satisfied in expectation, and each element
alone satisfies each packing constraint, we get by the same analysis as above that with high probability, $R$
violates every constraint by a factor of at most $O(\log m / \log\log m)$.

Finally, the expected cost of our solution is $c^T x^* \le OPT$. By Markov's inequality, the probability that
$c(R) > (1 + \epsilon)OPT$ is at most $1/(1 + \epsilon) < 1 - \epsilon/2$. With probability at least $\epsilon/2 - o(1)$, $c(R) \le (1 + \epsilon)OPT$
and all packing constraints are satisfied within $O(\log m / \log\log m)$. □

Let us rephrase this result in the more familiar setting of spanning trees. Given packing constraints on
the edges incident with each vertex, using arbitrary weights and capacities, we can find a spanning tree of
near-optimal cost, violating each packing constraint by a multiplicative factor of $O(\log m / \log\log m)$. As in
the previous section, if we assume that the weights are in $[0,1]$, this can be replaced by an additive factor of
$O(\frac{1}{\epsilon} \log m)$ while making the multiplicative factor $1 + \epsilon$ (see the end of Section 6.1).

In the general case of matroid bases, our result is incomparable to that of [17], which provides an additive
guarantee of $\Delta - 1$. (The assumption here is that each element participates in at most $\Delta$ degree constraints;
in our framework, this corresponds to $A \in \{0,1\}^{m \times N}$ with $\Delta$-sparse columns.) When elements participate in
many degree constraints ($\Delta \gg \log m$) and the degree bounds are $b_i = O(\log m)$, our result is actually stronger
in terms of the packing constraint guarantee.

Asymmetric Traveling Salesman and Maximum Entropy Sampling: In a recent breakthrough, [2] obtained
an $O(\log n / \log\log n)$-approximation for the ATSP problem. A crucial ingredient in the approach is to
round a point $x$ in the spanning tree polytope to a tree $T$ such that no cut of $G$ contains too many edges of $T$,
and the cost of the tree is within a constant factor of the cost of $x$. For this purpose, [2] uses the maximum
entropy sampling approach which also enjoys negative correlation properties and hence one can get
Chernoff-type bounds for linear sums of the variables; moreover $T$ contains each edge $e$ with probability $x_e$. We note
that the number of cuts is exponential in $n$. To address this issue, [2] uses Karger's result on the number of cuts
in a graph within a certain weight range: assuming that the minimum cut is at least 1, there are only $O(n^{2\alpha})$
cuts of weight in $(\alpha/2, \alpha]$ for any $\alpha \ge 1$. Maximum entropy sampling is technically quite involved and also
computationally expensive. Our rounding procedures can be used in place of maximum entropy sampling to
simplify the algorithm and the analysis in [2].

7 Multiobjective optimization with submodular functions 

In this section, we consider the following problem: Given a matroid $\mathcal{M} = (N, \mathcal{I})$ and $k$ monotone submodular
functions $f_1, \ldots, f_k : 2^N \to \mathbb{R}_+$, in what sense can we maximize $f_1(S), \ldots, f_k(S)$ simultaneously over
$S \in \mathcal{I}$? This question has been studied in the framework of multiobjective optimization, popularized in the CS
community by the work of Papadimitriou and Yannakakis [27]. The set of all solutions which are optimal with
respect to $f_1(S), \ldots, f_k(S)$ is captured by the notion of a pareto set: the set of all solutions $S$ such that for
any other feasible solution $S'$, there exists $i$ for which $f_i(S') < f_i(S)$. Since the pareto set in general can be
exponentially large, we settle for the notion of an $\epsilon$-approximate pareto set, where the condition is replaced by
$f_i(S') < (1 + \epsilon) f_i(S)$. Papadimitriou and Yannakakis show the following equivalence [27, Theorem 2]:

Proposition 7.1. An $\epsilon$-approximate pareto set can be found in polynomial time, if and only if the following
problem can be solved: Given $(V_1, \ldots, V_k)$, either return a solution with $f_i(S) \ge V_i$ for all $i$, or answer that
there is no solution such that $f_i(S) \ge (1 + \epsilon)V_i$ for all $i$.
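To make the $\epsilon$-approximate pareto condition concrete, the sketch below applies it to an explicit finite list of value vectors $(f_1(S), \ldots, f_k(S))$. It is only an illustration of the definition as stated above, with hypothetical function names; it is not the construction of [27], which avoids enumerating solutions.

```python
def is_eps_undominated(value, others, eps):
    """True if every other vector is beaten in some coordinate: o[i] < (1 + eps) * value[i]."""
    return all(any(o[i] < (1 + eps) * value[i] for i in range(len(value)))
               for o in others)

def eps_pareto_candidates(values, eps):
    """Keep the value vectors that satisfy the eps-approximate pareto condition."""
    return [v for idx, v in enumerate(values)
            if is_eps_undominated(v, values[:idx] + values[idx + 1:], eps)]

if __name__ == "__main__":
    values = [(10, 1), (1, 10), (5, 5), (4, 4)]
    print(eps_pareto_candidates(values, 0))
    print(eps_pareto_candidates(values, 0.5))
```

With $\epsilon = 0$ the vector $(4, 4)$ is excluded because $(5, 5)$ dominates it; with $\epsilon = 0.5$ it satisfies the relaxed condition.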

The latter problem is exactly what we address in this section. We show the following result. 

Theorem 7.2. For any fixed $\epsilon > 0$ and $k \ge 2$, given a matroid $\mathcal{M} = (N, \mathcal{I})$, monotone submodular functions
$f_1, \ldots, f_k : 2^N \to \mathbb{R}_+$, and values $V_1, \ldots, V_k \in \mathbb{R}_+$, in polynomial time we can either

• find a solution $S \in \mathcal{I}$ such that $f_i(S) \ge (1 - 1/e - \epsilon)V_i$ for all $i$, or

• return a certificate that there is no solution with $f_i(S) \ge V_i$ for all $i$.

If $f_i(S)$ are linear functions, the guarantee in the first case becomes $f_i(S) \ge (1 - \epsilon)V_i$.

This together with Proposition 7.1 implies that for any constant number of linear objective functions subject
to a matroid constraint, an $\epsilon$-approximate pareto set can be found in polynomial time. (This was known in the
case of multiobjective spanning trees [27].) Furthermore, a straightforward modification of Prop. 7.1 (see [27],
Theorem 2) implies that for monotone submodular functions $f_i(S)$, we can find a $(1 - 1/e - \epsilon)$-approximate
pareto set.

Our algorithm requires a modification of the continuous greedy algorithm from [35, 6]. We show the
following, which might be useful in other applications as well. In the following lemma, we do not require $k$ to
be constant.

Lemma 7.3. Consider monotone submodular functions $f_1, \ldots, f_k : 2^N \to \mathbb{R}_+$, their multilinear extensions
$F_i(x) = E[f_i(\hat{x})]$ and a down-monotone polytope $P \subseteq \mathbb{R}_+^N$ such that we can optimize linear functions over $P$
in polynomial time. Then given $V_1, \ldots, V_k \in \mathbb{R}_+$ we can either

• find a point $x \in P$ such that $F_i(x) \ge (1 - 1/e)V_i$ for all $i$, or

• return a certificate that there is no point $x \in P$ such that $F_i(x) \ge V_i$ for all $i$.

Proof. We refer to Section 2.3 of [6] for intuition and notation. Assuming that there is a solution $S \in \mathcal{I}$
achieving $f_i(S) \ge V_i$, Section 2.3 in [6] implies that for any fractional solution $y \in P(\mathcal{M})$ there is a direction
$v^*(y) \in P(\mathcal{M})$ such that $v^*(y) \cdot \nabla F_i(y) \ge V_i - F_i(y)$. Moreover, the way this direction is constructed is
by going towards the actual optimum, i.e., this direction is the same for all $i$. Assuming that such a direction
exists, we can find it by linear programming. If the LP is infeasible, we have a certificate that there is no solution
satisfying $f_i(S) \ge V_i$ for all $i$. Otherwise, we follow the continuous greedy algorithm and the analysis implies
that

\[ \frac{dF_i}{dt} \ge v^*(y(t)) \cdot \nabla F_i(y(t)) \ge V_i - F_i(y(t)) \]

which implies $F_i(y(1)) \ge (1 - 1/e)V_i$. □
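The last implication is the standard solution of the differential inequality $dF/dt \ge V - F$ with $F(0) = 0$. As a numerical illustration (a sketch only, with a hypothetical function name, and tracking the worst case where the inequality holds with equality), a forward Euler discretization with $n$ steps gives $F = V(1 - (1 - 1/n)^n)$, which is at least $(1 - 1/e)V$ for every $n$ and converges to it as $n$ grows.

```python
import math

def continuous_greedy_value(V, steps):
    """Forward Euler for dF/dt = V - F on [0, 1] with F(0) = 0."""
    F = 0.0
    dt = 1.0 / steps
    for _ in range(steps):
        F += dt * (V - F)     # each step gains a dt fraction of the remaining gap
    return F

if __name__ == "__main__":
    target = (1 - 1 / math.e) * 2.0
    for steps in (1, 10, 1000):
        print(steps, continuous_greedy_value(2.0, steps), target)
```

The same computation applies to each $F_i$ separately, since the direction $v^*(y)$ is shared by all objectives.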

Given Lemma 7.3, we sketch the proof of Theorem 7.2 as follows. First, we guess a constant number of
elements so that for each remaining element $j$, the marginal value for each $i$ is at most $\epsilon^3 V_i$. In the following, we
just assume that $f_i(j) \le \epsilon^3 V_i$ for all $i, j$. For each objective function $f_i$, we consider the multilinear relaxation
of the problem:

\[ \max\{F_i(x) : x \in P(\mathcal{M})\} \]

where $F_i(x) = E[f_i(\hat{x})]$. We apply Lemma 7.3 to find a fractional solution $y^*$ satisfying $F_i(y^*) \ge (1 - 1/e)V_i$
for all $i$ (or a certificate that there is no solution $y \in P(\mathcal{M})$ such that $F_i(y) \ge V_i$ for all $i$; this implies that there
is no feasible solution $S$ such that $f_i(S) \ge V_i$ for all $i$). For linear objective functions, the problem is much
simpler: then $F_i(x)$ are linear functions and we can find a fractional solution satisfying $F_i(y^*) \ge V_i$ directly by
linear programming.

We apply randomized swap rounding to $y^*$, to obtain a random solution $R \in \mathcal{I}$ satisfying the lower-tail
concentration bound of Theorem 1.4. The marginal values of $f_i$ are bounded by $\epsilon^3 V_i$, so by standard scaling
we obtain

\[ \Pr[f_i(R) \le (1 - \delta)F_i(y^*)] \le e^{-\delta^2 F_i(y^*)/(8\epsilon^3 V_i)} \le e^{-\delta^2/(16\epsilon^3)}. \]

Hence, we can set $\delta = \epsilon$ and obtain error probability at most $e^{-1/(16\epsilon)}$. By the union bound, the probability that
$f_i(R) \le (1 - \epsilon)F_i(y^*)$ for any $i$ is at most $k e^{-1/(16\epsilon)}$. For sufficiently small $\epsilon > 0$, this is a constant probability
smaller than 1. Then, $f_i(R) \ge (1 - 1/e - \epsilon)V_i$ for all $i$. This proves Theorem 7.2.
To conclude, we are able to find a $(1 - 1/e - \epsilon)$-approximate pareto set for any constant number of monotone
submodular functions and any matroid constraint. This has a natural interpretation in the setting of the
Submodular Welfare Problem (which is a special case, see [12, 22]). Then each objective function $f_i(S)$ is
the utility function of a player, and we want to find a pareto set with respect to all possible allocations. To
summarize, we can find a set of all allocations that are not dominated by any other allocation within a factor of
$1 - 1/e - \epsilon$ per player.
A Randomized pipage rounding

Let us summarize the pipage rounding technique in the context of matroid polytopes [5,6]. The basic version of
the technique assumes that we start with a point in the matroid base polytope, and we want to round it to a vertex
of $B(\mathcal{M})$. In each step, we have a fractional solution $y \in B(\mathcal{M})$ and a tight set $T$ (satisfying $y(T) = r(T)$)
containing at least two fractional variables. We modify the two fractional variables in such a way that their sum
remains constant, until some variable becomes integral or a new constraint becomes tight. If a new constraint
becomes tight, we continue with a new tight set, which can be shown to be a proper subset of the previous tight 
set lHHH. Hence, after n steps we produce a new integral variable, and the process terminates after n 2 steps. 

In the randomized version of the technique, each step is randomized in such a way that the expectation
of each variable is preserved. Here is the randomized version of pipage rounding [6]. The subroutine
HitConstraint($y$, $i$, $j$) starts from $y$ and tries to increase $y_i$ and decrease $y_j$ at the same rate, as long as
the solution is inside $B(\mathcal{M})$. It returns a new point $y$ and a tight set $A$, which would be violated if we go any
further. This is used in the main algorithm PipageRound($\mathcal{M}$, $y$), which repeats the process until an integral
solution in $B(\mathcal{M})$ is found.

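Subroutine HitConstraint($y$, $i$, $j$):
  Denote $\mathcal{A} = \{A \subseteq X : i \in A,\ j \notin A\}$;
  Find $\delta = \min_{A \in \mathcal{A}} (r_{\mathcal{M}}(A) - y(A))$
    and a set $A \in \mathcal{A}$ attaining the above minimum;
  If $y_j < \delta$ then {$\delta \leftarrow y_j$, $A \leftarrow \{j\}$};
  $y_i \leftarrow y_i + \delta$, $y_j \leftarrow y_j - \delta$;
  Return ($y$, $A$).

Algorithm PipageRound($\mathcal{M}$, $y$):
  While ($y$ is not integral) do
    $T \leftarrow X$;
    While ($T$ contains fractional variables) do
      Pick $i, j \in T$ fractional;
      ($y^+$, $A^+$) $\leftarrow$ HitConstraint($y$, $i$, $j$);
      ($y^-$, $A^-$) $\leftarrow$ HitConstraint($y$, $j$, $i$);
      $p \leftarrow \|y^+ - y\| / \|y^+ - y^-\|$;
      With probability $p$, {$y \leftarrow y^-$, $T \leftarrow T \cap A^-$};
      Else {$y \leftarrow y^+$, $T \leftarrow T \cap A^+$};
    EndWhile
  EndWhile
  Output $y$.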
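For small ground sets, the pseudocode above can be turned into a runnable sketch by brute-forcing the minimization in HitConstraint over all subsets. The code below is an illustration under assumptions rather than an efficient implementation: the matroid is given by a hypothetical rank oracle `rank(frozenset)`, the input is expected as exact `Fraction` values in the base polytope, $p$ is computed as $\delta^+/(\delta^+ + \delta^-)$ (which equals the norm ratio), and the degenerate case $\delta^+ = \delta^- = 0$ is resolved by taking the $y^+$ branch, a choice made here that the pseudocode leaves open.

```python
import itertools
import random
from fractions import Fraction

def hit_constraint(y, i, j, rank, ground):
    """Raise y[i] and lower y[j] equally until a constraint of B(M) becomes tight."""
    others = [e for e in ground if e != i and e != j]
    delta, tight = None, None
    for size in range(len(others) + 1):
        for extra in itertools.combinations(others, size):
            A = frozenset(extra) | {i}               # sets with i in A and j not in A
            slack = rank(A) - sum(y[e] for e in A)
            if delta is None or slack < delta:
                delta, tight = slack, A
    if y[j] < delta:                                 # y[j] reaches 0 first
        delta, tight = y[j], frozenset([j])
    moved = dict(y)
    moved[i] += delta
    moved[j] -= delta
    return moved, tight, delta

def pipage_round(y, rank, ground, rng):
    """Randomized pipage rounding of a point y in the base polytope (brute force)."""
    y = {e: Fraction(v) for e, v in y.items()}
    fractional = lambda T: [e for e in T if 0 < y[e] < 1]
    while fractional(ground):
        T = set(ground)
        while fractional(T):
            frac = fractional(T)
            if len(frac) < 2:
                raise ValueError("input does not lie in the base polytope")
            i, j = frac[0], frac[1]
            y_plus, A_plus, d_plus = hit_constraint(y, i, j, rank, ground)
            y_minus, A_minus, d_minus = hit_constraint(y, j, i, rank, ground)
            total = d_plus + d_minus
            p = d_plus / total if total > 0 else 0   # degenerate case: take y_plus
            if rng.random() < p:
                y, T = y_minus, T & A_minus
            else:
                y, T = y_plus, T & A_plus
    return y

if __name__ == "__main__":
    uniform_rank = lambda A: min(len(A), 2)          # uniform matroid U(2, 4)
    start = {e: Fraction(1, 2) for e in range(4)}
    print(pipage_round(start, uniform_rank, list(range(4)), random.Random(1)))
```

The subset enumeration is exponential in the ground set size, so this is suitable only for checking small examples; a practical implementation would minimize the submodular slack function instead.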

Subsequently [36], pipage rounding was extended to the case when the starting point is in the matroid
polytope $P(\mathcal{M})$, rather than $B(\mathcal{M})$. This is not an issue in [6], but it is necessary for applications with
non-monotone submodular functions [36] or with additional constraints, such as in this paper.
The following procedure takes care of the case when we start with a fractional solution $x \in P(\mathcal{M})$. It
adjusts the solution in a randomized way so that the expectation of each variable is preserved, and the new
fractional solution is in the base polytope of a (possibly reduced) matroid.

Algorithm Adjust($\mathcal{M}$, $x$):
  While ($x$ is not in $B(\mathcal{M})$) do
    If (there is $i$ and $\delta > 0$ such that $x + \delta e_i \in P(\mathcal{M})$) do
      Let $x_{\max} = x_i + \max\{\delta : x + \delta e_i \in P(\mathcal{M})\}$;
      Let $p = x_i / x_{\max}$;
      With probability $p$, {$x_i \leftarrow x_{\max}$};
      Else {$x_i \leftarrow 0$};
    EndIf
    If (there is $i$ such that $x_i = 0$) do
      Delete $i$ from $\mathcal{M}$ and remove the $i$-coordinate from $x$.
    EndIf
  EndWhile
  Output ($\mathcal{M}$, $x$).

To summarize, the complete procedure works as follows. For a given $x \in P(\mathcal{M})$, we run $(\mathcal{M}', y) :=$ Adjust($\mathcal{M}$, x), followed by PipageRound($\mathcal{M}'$, y). The outcome is a base in the restricted matroid where some elements have been deleted, i.e. an independent set in the original matroid.
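The Adjust step can be sketched in the same brute-force style. This is an illustrative sketch under assumptions, not the authors' code: `rank` is a hypothetical rank oracle for the original matroid (its restriction to the surviving elements plays the role of the reduced matroid), the largest feasible increase of a coordinate is found by enumerating all subsets, and coordinates are exact `Fraction` values.

```python
import random
from fractions import Fraction
from itertools import combinations


def max_increase(x, i, rank):
    """Largest delta with x + delta * e_i still in P(M), by brute force."""
    others = [e for e in x if e != i]
    best = None
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            A = frozenset(extra) | {i}
            slack = rank(A) - sum(x[e] for e in A)
            best = slack if best is None else min(best, slack)
    return best


def adjust(x, rank, rng):
    """Move x from P(M) into the base polytope of a restricted matroid."""
    x = {e: Fraction(v) for e, v in x.items()}
    while sum(x.values()) != rank(frozenset(x)):
        for i in list(x):
            delta = max_increase(x, i, rank)
            if delta > 0:
                x_max = x[i] + delta
                # Keeps E[x_i] unchanged: p * x_max + (1 - p) * 0 = x_i.
                x[i] = x_max if rng.random() < x[i] / x_max else Fraction(0)
                break
        for i in [e for e in x if x[e] == 0]:
            del x[i]  # delete i from the matroid and drop its coordinate
    return x
```

The returned dictionary lists the surviving elements with their adjusted values; its coordinates sum to the rank of the surviving ground set, so it can be passed on to a base-polytope rounding routine.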

B Proofs and generalizations for randomized swap rounding 

In this section we prove that randomized swap rounding satisfies the conditions of Lemma 4.1 and generalize the procedure to points in the matroid polytope.

B.1 Proof of conditions for negative correlation

Lemma B.1. Randomized swap rounding satisfies the conditions of Lemma 4.1.

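Proof. Let $X_{i,t}$ denote the $i$-th component of $\mathbf{X}_t$. To prove the first condition of Lemma 4.1 we condition on a particular vector $\mathbf{X}_t$ at time $t$ of the process and on its convex representation $\mathbf{X}_t = \sum_{\ell=1}^{n} \beta_\ell \mathbf{1}_{B_\ell}$. The vector $\mathbf{X}_{t+1}$ is obtained from $\mathbf{X}_t$ by an elementary operation. Without loss of generality we assume that the elementary operation does a swap between the bases $B_1$ and $B_2$ involving the elements $i \in B_1 \setminus B_2$ and $j \in B_2 \setminus B_1$. Let $B_1'$ and $B_2'$ be the bases after the swap. Hence, with probability $\beta_1/(\beta_1 + \beta_2)$, $B_1' = B_1$ and $B_2' = B_2 - j + i$, and with probability $\beta_2/(\beta_1 + \beta_2)$, $B_1' = B_1 - i + j$ and $B_2' = B_2$. Thus,

$$\mathbf{E}[\beta_1 \mathbf{1}_{B_1'} + \beta_2 \mathbf{1}_{B_2'}] = \frac{\beta_1}{\beta_1 + \beta_2}\left(\beta_1 \mathbf{1}_{B_1} + \beta_2 (\mathbf{1}_{B_2} - e_j + e_i)\right) + \frac{\beta_2}{\beta_1 + \beta_2}\left(\beta_1 (\mathbf{1}_{B_1} - e_i + e_j) + \beta_2 \mathbf{1}_{B_2}\right) = \beta_1 \mathbf{1}_{B_1} + \beta_2 \mathbf{1}_{B_2},$$

where $e_i = \mathbf{1}_{\{i\}}$ and $e_j = \mathbf{1}_{\{j\}}$ denote the canonical basis vectors corresponding to element $i$ and $j$, respectively. Since the vector $\mathbf{X}_{t+1}$ is given by $\mathbf{X}_{t+1} = \beta_1 \mathbf{1}_{B_1'} + \beta_2 \mathbf{1}_{B_2'} + \sum_{\ell=3}^{n} \beta_\ell \mathbf{1}_{B_\ell}$, we obtain $\mathbf{E}[\mathbf{X}_{t+1} \mid \mathbf{X}_t] = \mathbf{X}_t$.

The second condition of Lemma 4.1 is satisfied since an elementary operation only changes two elements in one base of the convex representation, as discussed above. To check the third condition of the lemma, assume without loss of generality that $\mathbf{X}_{t+1}$ is obtained from $\mathbf{X}_t = \sum_{\ell=1}^{n} \beta_\ell \mathbf{1}_{B_\ell}$ by replacing $B_1$ by $B_1 - i + j$. Hence, $X_{i,t+1} = X_{i,t} - \beta_1$ and $X_{j,t+1} = X_{j,t} + \beta_1$, so the sum $X_{i,t} + X_{j,t}$ is unchanged, implying that the third condition of the lemma is satisfied. □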

B.2 Adapting randomized swap rounding to points in the matroid polytope

In this section we show how randomized swap rounding can be generalized to round a point in the matroid polytope to an independent set, such that the conditions of Lemma 4.1 are still satisfied. We first present a generalization where the rounding is done by applying randomized swap rounding for base polytopes to an extension of the underlying matroid. In a second step we show that this procedure can easily be interpreted as a procedure on the initial matroid, leading to a simpler description of the method. An advantage of presenting the method as a special case of base rounding is that results presented for randomized swap rounding on base polytopes easily carry over to the general rounding procedure.

Let $x \in P(\mathcal{M})$ be the point to round. As in the base polytope case, we need a representation of $x$ as a convex combination of independent sets. Again, the algorithm of Cunningham [8] can be used to obtain a convex combination of $x$ using at most $n + 1$ independent sets with a running time which is bounded by $O(n^6)$ oracle calls. Thus, we assume that such a convex combination of $x$ using $n + 1$ independent sets $I_1, \dots, I_{n+1} \in \mathcal{I}$ is given, i.e., $x = \sum_{\ell=1}^{n+1} \beta_\ell \mathbf{1}_{I_\ell}$.

Let $\mathcal{M}' = (N', \mathcal{I}')$ be the following extension of the matroid $\mathcal{M} = (N, \mathcal{I})$. The set $N'$ is obtained from $N$ by adding $d$ additional dummy elements $\{s_1, \dots, s_d\}$, $N' = N \cup \{s_1, \dots, s_d\}$. The independent sets are defined by $\mathcal{I}' = \{I \subseteq N' \mid I \cap N \in \mathcal{I}, |I| \le d\}$. Thus, a base of $\mathcal{M}$ is also a base of $\mathcal{M}'$. The task of rounding $x$ in $\mathcal{M}$ can be transformed into rounding a point in the base polytope of $\mathcal{M}'$ as follows. Every independent set $I_\ell$ that is used in the convex representation of $x$ is extended to a base $B_\ell'$ of $\mathcal{M}'$ by adding an arbitrary subset of $\{s_1, \dots, s_d\}$ of cardinality $d - |I_\ell|$. Hence, $y = \sum_{\ell=1}^{n+1} \beta_\ell \mathbf{1}_{B_\ell'}$ is a point in the base polytope of $\mathcal{M}'$. Then the randomized swap rounding procedure as presented in Section 3 for points in the base polytope is used to get a point $\mathbf{1}_{B'}$ in $B(\mathcal{M}')$. The point $\mathbf{1}_{B'}$ is finally transformed into a point $x$ that is a vertex of $P(\mathcal{M})$ by projecting $\mathbf{1}_{B'}$ onto the components corresponding to elements in $N$. The point $x$ is returned by the algorithm. By Lemma B.1, the random point $\mathbf{1}_{B'}$ satisfies the conditions of Lemma 4.1. Since the projection does not change the distribution of the components of $\mathbf{1}_{B'}$, the point $x$ also satisfies the same properties.

The dummy elements can be interpreted as elements that do not have any influence on the final outcome, since they will be removed by the projection. Consider for example an elementary operation on two bases $B_1', B_2'$ of $\mathcal{M}'$ which are extensions of two independent sets $I_1, I_2 \in \mathcal{I}$, and let $i \in B_1' \setminus B_2'$ and $j \in B_2' \setminus B_1'$ be the two elements involved in the swap. If $i$ is a dummy element, i.e., $i \in \{s_1, \dots, s_d\}$, then replacing $B_2'$ by $B_2' - j + i$ corresponds to removing element $j$ from $I_2$.

Consider the above algorithm using dummy elements with the following modification: at each elementary operation, if possible, two non-dummy elements are chosen. One can easily observe that describing this version of the algorithm without dummy elements corresponds to replacing the MergeBases procedure with the following procedure to merge two independent sets. The procedure, called MergeIndepSets, takes two independent sets $I_1, I_2 \in \mathcal{I}$ and two positive scalars $\beta_1, \beta_2$ as input. To simplify the description of the procedure, we assume $|I_1| \ge |I_2|$; otherwise the roles of $I_1$ and $I_2$ have to be exchanged in the algorithm.

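Algorithm MergeIndepSets($\beta_1, I_1, \beta_2, I_2$):
    Find a set $S \subseteq I_1 \setminus I_2$ of cardinality $|I_1| - |I_2|$ such that $I_2 \cup S \in \mathcal{I}$;
    $I_2' = I_2 \cup S$;
    While ($I_1 \neq I_2'$) do
        Pick $i \in I_1 \setminus I_2'$ and find $j \in I_2' \setminus I_1$ such that $I_1 - i + j \in \mathcal{I}$ and $I_2' - j + i \in \mathcal{I}$;
        With probability $\beta_1/(\beta_1 + \beta_2)$, $\{I_2' \leftarrow I_2' - j + i\}$;
        Else $\{I_1 \leftarrow I_1 - i + j\}$;
    EndWhile
    For ($i \in S$) do
        With probability $\beta_2/(\beta_1 + \beta_2)$, $\{I_1 \leftarrow I_1 - i\}$;
    EndFor
    Output $I_1$.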

The existence of a set $S$ as used in the algorithm easily follows from the matroid axioms [30]. It can be found by successively choosing elements in $I_1 \setminus I_2$ that can be added to $I_2$ while still maintaining independence. Once the element $i \in I_1 \setminus I_2'$ is chosen in the while loop of the algorithm, the existence of an element $j \in I_2' \setminus I_1$ satisfying $I_1 - i + j \in \mathcal{I}$ and $I_2' - j + i \in \mathcal{I}$ is guaranteed by applying Theorem 2.1 to the matroid $\mathcal{M}' = (N, \mathcal{I}')$ given by $\mathcal{I}' = \{I \in \mathcal{I} \mid |I| \le |I_1|\}$.
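The merge procedure can be sketched in Python as follows. This is a minimal sketch under assumptions: `indep` is a hypothetical independence oracle for the matroid, the inputs satisfy $|I_1| \ge |I_2|$, elements are comparable (so that `sorted` and `min` give a deterministic choice), and the set $S$ is found by the greedy augmentation described above.

```python
import random


def merge_indep_sets(beta1, I1, beta2, I2, indep, rng):
    """Randomly merge two independent sets (assumes len(I1) >= len(I2))."""
    I1, I2 = set(I1), set(I2)
    # Greedily pick S inside I1 \ I2 so that I2 + S stays independent.
    S = set()
    for e in sorted(I1 - I2):
        if len(I2) + len(S) < len(I1) and indep(I2 | S | {e}):
            S.add(e)
    I2p = I2 | S
    while I1 != I2p:
        i = min(I1 - I2p)
        # A valid exchange partner j exists by the exchange property cited above.
        j = next(e for e in sorted(I2p - I1)
                 if indep((I1 - {i}) | {e}) and indep((I2p - {e}) | {i}))
        if rng.random() < beta1 / (beta1 + beta2):
            I2p = (I2p - {j}) | {i}
        else:
            I1 = (I1 - {i}) | {j}
    # Elements of S stood in for dummy elements of I2; drop each one
    # with the probability that the I2 side wins.
    for i in S:
        if rng.random() < beta2 / (beta1 + beta2):
            I1.discard(i)
    return I1
```

With this choice of probabilities, each element of $I_1 \setminus I_2$ survives with probability $\beta_1/(\beta_1+\beta_2)$ and each element of $I_2 \setminus I_1$ with probability $\beta_2/(\beta_1+\beta_2)$, which is what preserving the expectation of $\beta_1 \mathbf{1}_{I_1} + \beta_2 \mathbf{1}_{I_2}$ requires.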

C Chernoff bounds for submodular functions 

Here we prove Theorem 1.3, a Chernoff-type bound for a monotone submodular function $f(X_1, \dots, X_n)$ where $X_1, \dots, X_n \in \{0, 1\}$ are independent random variables. Similarly to the proof of Chernoff bounds for linear functions, the main trick is to prove a bound on the exponential moments $\mathbf{E}[e^{\lambda f(X_1, \dots, X_n)}]$. For that purpose, we write the value of $f(X_1, \dots, X_n)$ as follows: $f(X_1, \dots, X_n) = \sum_{i=1}^{n} Y_i$, where

$$Y_i = f(X_1, \dots, X_i, 0, \dots, 0) - f(X_1, \dots, X_{i-1}, 0, \dots, 0).$$

The new complication is that the variables $Y_i$ are not independent. There could be negative and even positive correlations between $Y_i, Y_j$. What is important for us, however, is that we can show negative correlation between $e^{\lambda \sum_{i=1}^{k-1} Y_i}$ and $e^{\lambda Y_k}$, and by induction the following bound.

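Lemma C.1. For any $\lambda \in \mathbb{R}$, a monotone submodular function $f$ and $Y_1, \dots, Y_n$ defined as above,

$$\mathbf{E}\left[e^{\lambda \sum_{i=1}^{n} Y_i}\right] \le \prod_{i=1}^{n} \mathbf{E}\left[e^{\lambda Y_i}\right].$$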

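Proof. Denote $p_i = \Pr[X_i = 1]$. For any $k$, we have

$$\mathbf{E}\left[e^{\lambda \sum_{i=1}^{k} Y_i}\right] = \mathbf{E}\left[e^{\lambda f(X_1, \dots, X_k, 0, \dots, 0)}\right]$$
$$= p_k \mathbf{E}\left[e^{\lambda f(X_1, \dots, X_{k-1}, 1, 0, \dots, 0)}\right] + (1 - p_k) \mathbf{E}\left[e^{\lambda f(X_1, \dots, X_{k-1}, 0, \dots, 0)}\right]$$
$$= p_k \mathbf{E}\left[e^{\lambda f(X_1, \dots, X_{k-1}, 0, \dots, 0)} e^{\lambda F_k(X_1, \dots, X_{k-1}, 0, \dots, 0)}\right] + (1 - p_k) \mathbf{E}\left[e^{\lambda f(X_1, \dots, X_{k-1}, 0, \dots, 0)}\right]$$

where

$$F_k(X_1, \dots, X_{k-1}, 0, \dots, 0) = f(X_1, \dots, X_{k-1}, 1, 0, \dots, 0) - f(X_1, \dots, X_{k-1}, 0, \dots, 0)$$

denotes the marginal value of $X_k$ being set to 1, given the preceding variables. Observe that $\mathbf{E}[F_k(X_1, \dots, X_{k-1}, 0, \dots, 0)] = \mathbf{E}[Y_k \mid X_k = 1]$.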

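By submodularity, $F_k$ is a decreasing function of $(X_1, \dots, X_{k-1})$. On the other hand, $\sum_{i=1}^{k-1} Y_i = f(X_1, \dots, X_{k-1}, 0, \dots, 0)$ is an increasing function of $(X_1, \dots, X_{k-1})$. We get the same monotonicity properties for the exponential functions $e^{\lambda f(\cdot)}$ and $e^{\lambda F_k(\cdot)}$ (with a switch in monotonicity for $\lambda < 0$). By the FKG inequality, $e^{\lambda f(X_1, \dots, X_{k-1}, 0, \dots, 0)}$ and $e^{\lambda F_k(X_1, \dots, X_{k-1}, 0, \dots, 0)}$ are negatively correlated, and we get

$$\mathbf{E}\left[e^{\lambda f(X_1, \dots, X_{k-1}, 0, \dots, 0)} e^{\lambda F_k(X_1, \dots, X_{k-1}, 0, \dots, 0)}\right] \le \mathbf{E}\left[e^{\lambda f(X_1, \dots, X_{k-1}, 0, \dots, 0)}\right] \mathbf{E}\left[e^{\lambda F_k(X_1, \dots, X_{k-1}, 0, \dots, 0)}\right]$$
$$= \mathbf{E}\left[e^{\lambda \sum_{i=1}^{k-1} Y_i}\right] \mathbf{E}\left[e^{\lambda Y_k} \mid X_k = 1\right].$$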

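Hence, we have

$$\mathbf{E}\left[e^{\lambda \sum_{i=1}^{k} Y_i}\right] \le p_k \mathbf{E}\left[e^{\lambda \sum_{i=1}^{k-1} Y_i}\right] \mathbf{E}\left[e^{\lambda Y_k} \mid X_k = 1\right] + (1 - p_k) \mathbf{E}\left[e^{\lambda \sum_{i=1}^{k-1} Y_i}\right]$$
$$= \mathbf{E}\left[e^{\lambda \sum_{i=1}^{k-1} Y_i}\right] \cdot \left(p_k \mathbf{E}\left[e^{\lambda Y_k} \mid X_k = 1\right] + (1 - p_k) \cdot 1\right)$$
$$= \mathbf{E}\left[e^{\lambda \sum_{i=1}^{k-1} Y_i}\right] \cdot \left(p_k \mathbf{E}\left[e^{\lambda Y_k} \mid X_k = 1\right] + (1 - p_k) \mathbf{E}\left[e^{\lambda Y_k} \mid X_k = 0\right]\right)$$
$$= \mathbf{E}\left[e^{\lambda \sum_{i=1}^{k-1} Y_i}\right] \cdot \mathbf{E}\left[e^{\lambda Y_k}\right].$$

By induction, we obtain the lemma. □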
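As a sanity check, the inequality of Lemma C.1 can be evaluated exactly on a small instance by enumerating all outcomes. The sketch below is illustrative only: the coverage function, its scaling, and the helper names `coverage` and `moment_sides` are assumptions introduced here, not objects from the paper.

```python
import math
from itertools import product


def coverage(sets, scale):
    """Monotone submodular f(X) = (size of the union of the chosen sets) / scale."""
    def f(bits):
        covered = set()
        for chosen, s in zip(bits, sets):
            if chosen:
                covered |= s
        return len(covered) / scale
    return f


def moment_sides(f, probs, lam):
    """Return (E[exp(lam * sum_i Y_i)], prod_i E[exp(lam * Y_i)]) by enumeration."""
    n = len(probs)
    lhs = 0.0
    factors = [0.0] * n
    for bits in product([0, 1], repeat=n):
        weight = math.prod(p if b else 1 - p for b, p in zip(bits, probs))
        lhs += weight * math.exp(lam * f(bits))  # sum_i Y_i = f(X) since f(0) = 0
        for i in range(n):
            prefix = bits[:i + 1] + (0,) * (n - i - 1)
            before = bits[:i] + (0,) * (n - i)
            factors[i] += weight * math.exp(lam * (f(prefix) - f(before)))
    return lhs, math.prod(factors)
```

For example, `moment_sides(coverage([{1, 2}, {2, 3}, {3, 4}, {1, 4}], 4), [0.3, 0.5, 0.7, 0.4], 0.5)` returns the two sides of the lemma for one choice of $\lambda$; the first value should not exceed the second.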

Given this lemma, we can finish the proof of Theorem 1.3 following the same outline as the proof of the Chernoff bound.

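Proof. Let $Y_i = f(X_1, \dots, X_i, 0, \dots, 0) - f(X_1, \dots, X_{i-1}, 0, \dots, 0)$ as above. Let us denote $\mathbf{E}[Y_i] = \omega_i$ and $\mu = \sum_{i=1}^{n} \omega_i = \mathbf{E}[f(X_1, \dots, X_n)]$. By the convexity of the exponential and the fact that $Y_i \in [0, 1]$,

$$\mathbf{E}[e^{\lambda Y_i}] \le \omega_i e^{\lambda} + (1 - \omega_i) = 1 + (e^{\lambda} - 1)\omega_i \le e^{(e^{\lambda} - 1)\omega_i}.$$

Lemma C.1 then implies

$$\mathbf{E}[e^{\lambda f(X_1, \dots, X_n)}] = \mathbf{E}\left[e^{\lambda \sum_{i=1}^{n} Y_i}\right] \le \prod_{i=1}^{n} \mathbf{E}[e^{\lambda Y_i}] \le e^{(e^{\lambda} - 1)\mu}.$$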

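For the upper-tail bound, we use Markov's inequality as follows:

$$\Pr[f(X_1, \dots, X_n) \ge (1 + \delta)\mu] = \Pr[e^{\lambda f(X_1, \dots, X_n)} \ge e^{\lambda(1+\delta)\mu}] \le \frac{\mathbf{E}[e^{\lambda f(X_1, \dots, X_n)}]}{e^{\lambda(1+\delta)\mu}} \le \frac{e^{(e^{\lambda} - 1)\mu}}{e^{\lambda(1+\delta)\mu}}.$$

We choose $e^{\lambda} = 1 + \delta$ which yields

$$\Pr[f(X_1, \dots, X_n) \ge (1 + \delta)\mu] \le \frac{e^{\delta\mu}}{(1 + \delta)^{(1+\delta)\mu}}.$$

For the lower-tail bound, we use Markov's inequality with $\lambda < 0$ as follows:

$$\Pr[f(X_1, \dots, X_n) \le (1 - \delta)\mu] = \Pr[e^{\lambda f(X_1, \dots, X_n)} \ge e^{\lambda(1-\delta)\mu}] \le \frac{\mathbf{E}[e^{\lambda f(X_1, \dots, X_n)}]}{e^{\lambda(1-\delta)\mu}} \le \frac{e^{(e^{\lambda} - 1)\mu}}{e^{\lambda(1-\delta)\mu}}.$$

We choose $e^{\lambda} = 1 - \delta$ which yields

$$\Pr[f(X_1, \dots, X_n) \le (1 - \delta)\mu] \le \frac{e^{-\delta\mu}}{(1 - \delta)^{(1-\delta)\mu}} \le e^{-\mu\delta^2/2},$$

using $(1 - \delta)^{1-\delta} \ge e^{-\delta + \delta^2/2}$ for $\delta \in (0, 1]$. □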
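To see how the two tail bounds compare with the true tails, one can compute the exact distribution of a small submodular function under independent rounding. The following is a sketch under assumptions: the coverage function `cover2` (scaled so that its marginal values lie in $[0, 1]$) and the helper `tails` are hypothetical names introduced only for this illustration.

```python
import math
from itertools import product

SETS = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]


def cover2(bits):
    """Coverage count divided by 2, so every marginal value is in [0, 1]."""
    covered = set()
    for b, s in zip(bits, SETS):
        if b:
            covered |= s
    return len(covered) / 2


def tails(f, probs, delta):
    """Exact lower/upper tails of f next to the two Chernoff-type bounds."""
    dist = []
    for bits in product([0, 1], repeat=len(probs)):
        weight = math.prod(p if b else 1 - p for b, p in zip(bits, probs))
        dist.append((weight, f(bits)))
    mu = sum(w * v for w, v in dist)
    lower = sum(w for w, v in dist if v <= (1 - delta) * mu)
    upper = sum(w for w, v in dist if v >= (1 + delta) * mu)
    lower_bound = math.exp(-mu * delta ** 2 / 2)
    upper_bound = (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu
    return mu, lower, lower_bound, upper, upper_bound
```

Calling `tails(cover2, [0.3, 0.5, 0.7, 0.4], 0.5)` returns the mean, the exact lower tail with its bound, and the exact upper tail with its bound, so the slack in each inequality can be inspected directly.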



D Lower-tail estimate for submodular functions under dependent rounding 

In this section, we prove Theorem 1.4, i.e. an exponential estimate for the lower tail of the distribution of a monotone submodular function under randomized swap rounding. We note that the bound on the expected value of the rounded solution, $\mathbf{E}[f(R)] \ge \mu_0$, follows by the convexity of $F(x)$ along directions $e_i - e_j$, just as in the analysis of pipage rounding; we omit the details. The exponential tail bound is much more involved. We start by setting up some notation.



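Notation. The rounding procedure starts from a convex linear combination of bases,

$$x = \sum_{i=1}^{n} \beta_i \mathbf{1}_{B_i}.$$

The rounding proceeds in stages, where in the first stage we merge the bases $B_1, B_2$ (randomly) into a new base $C_2$, and replace $\beta_1 \mathbf{1}_{B_1} + \beta_2 \mathbf{1}_{B_2}$ in the linear combination by $\gamma_2 \mathbf{1}_{C_2}$, with $\gamma_2 = \beta_1 + \beta_2$. More generally, in the $k$-th stage, we merge $C_k$ and $B_{k+1}$ into a new base $C_{k+1}$ (we set $C_1 = B_1$ in the first stage), and replace $\gamma_k \mathbf{1}_{C_k} + \beta_{k+1} \mathbf{1}_{B_{k+1}}$ in the linear combination by $\gamma_{k+1} \mathbf{1}_{C_{k+1}}$. Inductively, $\gamma_{k+1} = \gamma_k + \beta_{k+1} = \sum_{i=1}^{k+1} \beta_i$. After $n - 1$ stages, we obtain a linear combination $\gamma_n \mathbf{1}_{C_n}$ and $\gamma_n = \sum_{i=1}^{n} \beta_i = 1$, i.e., this is an integer solution. We use the following notation to describe the vectors produced in the process: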
We use the following notation to describe the vectors produced in the process: 

• $b_i = \beta_i \mathbf{1}_{B_i}$

• $y_i = \sum_{j=i}^{n} b_j$

• $c_i = \gamma_i \mathbf{1}_{C_i}$

• $x_k = c_{k+1} + y_{k+2} = \gamma_{k+1} \mathbf{1}_{C_{k+1}} + \sum_{i=k+2}^{n} \beta_i \mathbf{1}_{B_i}$

In other words, $b_i$ are the initial vectors in the linear combination, which get gradually replaced by $c_i$, and $x_k$ is the fractional solution after $k$ stages.

We emphasize that $x_k$ denotes the entire fractional solution at a certain stage and not the value of its $k$-th coordinate. The coordinates of the fractional solution are the variables $X_i$. If we want to refer to the value of $X_i$ after $k$ stages, we use the notation $X_{i,k}$.

We work with the multilinear extension of a submodular function, $F(x) = \mathbf{E}[f(\hat{x})]$, where $\hat{x}$ is obtained by rounding each coordinate $x_i$ independently to 1 with probability $x_i$ and to 0 otherwise. In the following, we use the following shorthand notation and basic properties:

• $F_i(x)$ denotes the partial derivative $\frac{\partial F}{\partial x_i}$ evaluated at $x$. The interpretation of $F_i(x)$ is the marginal value of $i$ with respect to the fractional solution $x$.

• We use $e_i = \mathbf{1}_{\{i\}}$ to denote the canonical basis vector corresponding to element $i$.

• If only one variable is changing while others are fixed, $F(x)$ is a linear function. Therefore, we can use the following formula:

$$F(x + t e_i) = F(x) + t F_i(x).$$

• Due to submodularity, $\frac{\partial^2 F}{\partial x_i \partial x_j} \le 0$ for any $i, j$. This implies that $F_i(x) = \frac{\partial F}{\partial x_i}$ is non-increasing as a function of each coordinate of $x$. If $y$ dominates $x$ in all coordinates ($x \le y$), we have $F_i(x) \ge F_i(y)$.
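The properties listed above can be checked numerically on a toy instance by computing the multilinear extension through full enumeration. This is only a sketch under assumptions: the example function `cover` and the helper names `multilinear` and `partial` are hypothetical, and the enumeration is exponential in the number of variables.

```python
from itertools import product


def cover(bits):
    """Example monotone submodular f: number of items covered, scaled by 1/4."""
    sets = [{1, 2}, {2, 3}, {3, 4}]
    covered = set()
    for b, s in zip(bits, sets):
        if b:
            covered |= s
    return len(covered) / 4


def multilinear(f, x):
    """F(x) = E[f(R)], where R contains element i independently w.p. x[i]."""
    total = 0.0
    for bits in product([0, 1], repeat=len(x)):
        weight = 1.0
        for b, xi in zip(bits, x):
            weight *= xi if b else 1.0 - xi
        total += weight * f(bits)
    return total


def partial(f, x, i):
    """F_i(x) = F(x with x_i = 1) - F(x with x_i = 0); valid since F is linear in x_i."""
    hi = x[:i] + [1.0] + x[i + 1:]
    lo = x[:i] + [0.0] + x[i + 1:]
    return multilinear(f, hi) - multilinear(f, lo)
```

With `x = [0.2, 0.5, 0.7]`, comparing `multilinear(cover, [0.2, 0.8, 0.7])` with `multilinear(cover, x) + 0.3 * partial(cover, x, 1)` illustrates the linearity formula, and evaluating `partial` at a coordinatewise larger point illustrates the monotonicity of $F_i$.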

Proof overview. The random process in terms of the evolution of $F(x)$ is a submartingale, i.e. the value in each step can only increase in expectation. This is a good sign; however, a straightforward application of concentration bounds for martingales yields a dependence on the number of variables $n$ which would render the bound meaningless. More refined bounds for martingales rely on bounds on the variance in successive steps. Unfortunately, these are also difficult to use since we do not have a good a priori bound on the variance in each step. The variance can depend on preceding steps, and taking worst-case bounds leads to the same dependence on $n$ as mentioned above.

In order to prove a bound which depends only on the parameters $\delta$ and $\mu_0$, we start from scratch and follow the standard recipe: estimate the exponential moment $\mathbf{E}[e^{\lambda(\mu_0 - f(R))}]$, where $\mu_0$ is the initial value and $R$ is the rounded solution. We decompose the expression $e^{\lambda(\mu_0 - f(R))}$ into a telescoping product:

$$e^{\lambda(\mu_0 - f(R))} = e^{\lambda(F(x_0) - F(x_{n-1}))} = e^{\lambda(F(x_0) - F(x_1))} \cdot e^{\lambda(F(x_1) - F(x_2))} \cdots e^{\lambda(F(x_{n-2}) - F(x_{n-1}))}.$$

The factors in this product are not independent, but we can prove bounds on the conditional expectations $\mathbf{E}[e^{\lambda(F(x_{k-1}) - F(x_k))} \mid x_0, \dots, x_{k-1}]$, in other words conditioned on a given history of the rounding process. These bounds depend on the history, but we are able to charge the arising factors to the value of $\mu_0 = F(x_0)$ in such a way that the final bound depends only on $\mu_0$.

We start from the bottom, by analyzing the basic rounding step for two variables. The following elementary 
inequality will be helpful. 

Lemma D.1. For any $p \in [0, 1]$ and $\xi \in [-1, 1]$,

$$p e^{\xi(1-p)} + (1 - p) e^{-\xi p} \le e^{\xi^2 p(1-p)}.$$

Proof. If $\xi < 0$, we can replace $\xi$ by $-\xi$ and $p$ by $1 - p$; the statement of the lemma remains the same. So we can assume $\xi \in [0, 1]$.

Fix any $p \in [0, 1]$ and define $\phi_p(\xi) = e^{\xi^2 p(1-p)} - p e^{\xi(1-p)} - (1 - p) e^{-\xi p}$. It is easy to see that $\phi_p(0) = 0$. Our goal is to prove that $\phi_p(\xi) \ge 0$ for $\xi \in [0, 1]$. Let us compute the derivative of $\phi_p(\xi)$ with respect to $\xi$:

$$\phi_p'(\xi) = 2\xi p(1-p) e^{\xi^2 p(1-p)} - p(1-p) e^{\xi(1-p)} + p(1-p) e^{-\xi p}$$
$$= p(1-p) e^{-\xi p} \left(2\xi e^{\xi^2 p(1-p) + \xi p} - e^{\xi} + 1\right)$$
$$\ge p(1-p) e^{-\xi p} \left(2\xi - e^{\xi} + 1\right).$$

For $\xi \in [0, 1]$, we have $e^{\xi} \le 1 + 2\xi$ and hence $\phi_p'(\xi) \ge 0$. This means that $\phi_p(\xi)$ is non-decreasing and $\phi_p(\xi) \ge 0$ for $\xi \in [0, 1]$. □

Note that the lemma does not hold for arbitrarily large $\xi$, e.g. when $p = 1/\xi^2$ and $\xi \to \infty$. Next, we apply this lemma to the basic step of the rounding procedure.
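Both the inequality of Lemma D.1 and its failure for large $\xi$ are easy to probe numerically. The sketch below is a plain grid check, not a proof; the function names `lemma_d1_gap` and `worst_gap` are introduced here for illustration.

```python
import math


def lemma_d1_gap(p, xi):
    """Right-hand side minus left-hand side of Lemma D.1 (nonnegative iff it holds)."""
    lhs = p * math.exp(xi * (1 - p)) + (1 - p) * math.exp(-xi * p)
    rhs = math.exp(xi * xi * p * (1 - p))
    return rhs - lhs


def worst_gap(steps=200):
    """Smallest gap over a grid of p in [0, 1] and xi in [-1, 1]."""
    return min(
        lemma_d1_gap(a / steps, -1 + 2 * b / steps)
        for a in range(steps + 1)
        for b in range(steps + 1)
    )
```

On the grid the gap stays nonnegative up to floating-point error, while `lemma_d1_gap(1 / 100, 10)` (that is, $p = 1/\xi^2$ with $\xi = 10$) is negative, matching the remark above.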

Lemma D.2. Let $F(x)$ be the multilinear extension of a monotone submodular function with marginal values in $[0, 1]$, and let $\lambda \in [0, 1]$. Consider one elementary operation of randomized swap rounding, where two variables $X_i, X_j$ are modified. Let $x$ denote the fractional solution before, $x'$ after this step, and let $\mathcal{H}$ denote the complete history prior to this rounding step. Assume that the values of the two variables before the rounding step are $X_i = \gamma$, $X_j = \beta$. Then

$$\mathbf{E}[e^{\lambda(F(x) - F(x'))} \mid \mathcal{H}] \le e^{\lambda^2 \beta\gamma (F_j(x) - F_i(x))^2}$$

where $F_i(x) = \frac{\partial F}{\partial x_i}(x)$ and $F_j(x) = \frac{\partial F}{\partial x_j}(x)$.

Proof. Fix the history $\mathcal{H}$; this includes the point $x$ before the rounding step. With probability $p = \frac{\gamma}{\beta + \gamma}$, the rounding step is $X_i' = X_i + \beta$ and $X_j' = X_j - \beta$. I.e., $x' = x + \beta e_i - \beta e_j$. Since $F(x)$ is linear when only one coordinate is modified, we get

$$F(x') = F(x) + \beta F_i(x) - \beta F_j(x + \beta e_i).$$

By submodularity, $F_j(x + \beta e_i) \le F_j(x)$ and hence

$$F(x') = F(x) + \beta F_i(x) - \beta F_j(x + \beta e_i) \ge F(x) + \beta(F_i(x) - F_j(x)).$$

With probability $1 - p$, we set $X_i' = X_i - \gamma$ and $X_j' = X_j + \gamma$. By similar reasoning, in this case we get

$$F(x') = F(x) - \gamma F_i(x) + \gamma F_j(x - \gamma e_i) \ge F(x) - \gamma(F_i(x) - F_j(x)).$$

Taking expectation over the two cases, we get

$$\mathbf{E}[e^{\lambda(F(x) - F(x'))} \mid \mathcal{H}] \le p e^{\lambda\beta(F_j(x) - F_i(x))} + (1 - p) e^{-\lambda\gamma(F_j(x) - F_i(x))}$$
$$= p e^{\lambda(1-p)(\beta+\gamma)(F_j(x) - F_i(x))} + (1 - p) e^{-\lambda p(\beta+\gamma)(F_j(x) - F_i(x))}.$$

We invoke Lemma D.1 with $\xi = \lambda(\beta + \gamma)(F_j(x) - F_i(x))$ (we have $|\xi| \le 1$ due to $\lambda, \beta + \gamma, F_i(x), F_j(x)$ all being in $[0, 1]$). We get

$$\mathbf{E}[e^{\lambda(F(x) - F(x'))} \mid \mathcal{H}] \le e^{\xi^2 p(1-p)} = e^{\lambda^2 \beta\gamma (F_j(x) - F_i(x))^2}.$$

□
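The bound of Lemma D.2 can be evaluated on a small example by computing both outcomes of a single swap exactly. The following is a sketch under assumptions: the coverage function `cover`, the point `x`, and the helper names are hypothetical choices made for illustration, and the multilinear extension is computed by enumeration.

```python
import math
from itertools import product


def cover(bits):
    """Example monotone submodular f with marginal values in [0, 1]."""
    sets = [{1, 2}, {2, 3}, {3, 4}]
    covered = set()
    for b, s in zip(bits, sets):
        if b:
            covered |= s
    return len(covered) / 4


def multilinear(f, x):
    """F(x) = E[f(R)] with each element i included independently w.p. x[i]."""
    total = 0.0
    for bits in product([0, 1], repeat=len(x)):
        weight = 1.0
        for b, xi in zip(bits, x):
            weight *= xi if b else 1.0 - xi
        total += weight * f(bits)
    return total


def swap_step_sides(f, x, i, j, lam):
    """Both sides of the Lemma D.2 bound for one swap on coordinates i and j."""
    gamma, beta = x[i], x[j]
    p = gamma / (beta + gamma)
    up = list(x)      # taken with probability p
    up[i], up[j] = gamma + beta, 0.0
    down = list(x)    # taken with probability 1 - p
    down[i], down[j] = 0.0, beta + gamma
    base = multilinear(f, x)
    lhs = (p * math.exp(lam * (base - multilinear(f, up)))
           + (1 - p) * math.exp(lam * (base - multilinear(f, down))))

    def deriv(k):
        hi, lo = list(x), list(x)
        hi[k], lo[k] = 1.0, 0.0
        return multilinear(f, hi) - multilinear(f, lo)

    rhs = math.exp(lam ** 2 * beta * gamma * (deriv(j) - deriv(i)) ** 2)
    return lhs, rhs
```

For instance, `swap_step_sides(cover, [0.3, 0.5, 0.6], 0, 1, 1.0)` returns the conditional exponential moment and the bound for a swap between the first two coordinates; the first value should not exceed the second.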

Note that the exponent on the right-hand side of Lemma D.2 corresponds to the variance in one step of the rounding procedure. The next lemma estimates these contributions, aggregated over one stage of the rounding process, i.e., the merging of the bases $C_k$ and $B_{k+1}$. The exponent on the right-hand side of Lemma D.3 corresponds to the variance of the random process accumulated over the $k$-th stage. It is crucial that we compare this quantity to certain values which can be eventually charged to $\mu_0$.

Lemma D.3. Let $F(x)$ be the multilinear extension of a monotone submodular function with marginal values in $[0, 1]$, and let $\lambda \in [0, 1]$. Consider the $k$-th stage of the rounding process, when bases $C_k$ and $B_{k+1}$ (with coefficients $\gamma_k$ and $\beta_{k+1}$) are merged into $C_{k+1}$. The fractional solution before this stage is $x_{k-1}$ and after this stage $x_k$. Conditioned on any history $\mathcal{H}$ of the rounding process throughout the first $k - 1$ stages,

$$\mathbf{E}[e^{\lambda(F(x_{k-1}) - F(x_k))} \mid \mathcal{H}] \le e^{\lambda^2 (\beta_{k+1} F(c_k) + \gamma_k (F(y_{k+1}) - F(y_{k+2})))}.$$

Proof. The $k$-th stage merges bases $C_k$ and $B_{k+1}$ into $C_{k+1}$ by taking elements in pairs and performing rounding steps as in Lemma D.2. Let us denote the pairs of elements considered by the rounding procedure $(c_1, b_1), \dots, (c_d, b_d)$, where $C_k = \{c_1, \dots, c_d\}$ and $B_{k+1} = \{b_1, \dots, b_d\}$. The matching is not determined beforehand: $(c_2, b_2)$ might depend on the random choice between $c_1, b_1$, etc. In the following, we drop the index $k$ and denote by $x^i$ the fractional solution obtained after processing $(c_1, b_1), \dots, (c_i, b_i)$. We start with $x^0 = x_{k-1}$ and after processing all $d$ pairs, we get $x^d = x_k$. We also replace $\beta_{k+1}, \gamma_k$ simply by $\beta, \gamma$. We denote by $\mathcal{H}_i$ the complete history prior to the rounding step involving $(c_{i+1}, b_{i+1})$; in particular, this includes the fractional solution $x^i$.

Using Lemma D.2 for the rounding step involving $(c_{i+1}, b_{i+1})$, we get

$$\mathbf{E}[e^{\lambda(F(x^i) - F(x^{i+1}))} \mid \mathcal{H}_i] \le e^{\lambda^2 \gamma\beta (F_{c_{i+1}}(x^i) - F_{b_{i+1}}(x^i))^2} \le e^{\lambda^2 \gamma\beta (F_{c_{i+1}}(x^i) + F_{b_{i+1}}(x^i))},$$

using the fact that the partial derivatives $F_j(x^i)$ are in $[0, 1]$.

Further, we modify the exponent of the right-hand side as follows. The vector $x^i$ is obtained after processing $i$ pairs and still contains the coordinates $c_{i+1}, \dots, c_d$ of $c_k = \gamma \mathbf{1}_{C_k}$ untouched: in other words, $x^i \ge \gamma \mathbf{1}_{\{c_{i+1}, \dots, c_d\}}$. Let us define

• $c^i = \gamma \mathbf{1}_{\{c_{i+1}, \dots, c_d\}}$.

I.e., $x^i \ge c^i \ge c^{i+1}$. By submodularity, we have $F_{c_{i+1}}(x^i) \le F_{c_{i+1}}(c^{i+1})$.

Similarly, the vector $x^i$ also contains the coordinates $b_{i+1}, \dots, b_d$ of $b_{k+1}$ and all of $y_{k+2} = \sum_{j=k+2}^{n} b_j$ unchanged: $x^i \ge \beta \mathbf{1}_{\{b_{i+1}, \dots, b_d\}} + y_{k+2}$. Let us define

• $y^i = \beta \mathbf{1}_{\{b_{i+1}, \dots, b_d\}} + y_{k+2}$.

I.e., $x^i \ge y^i \ge y^{i+1}$. By submodularity, we get $F_{b_{i+1}}(x^i) \le F_{b_{i+1}}(y^{i+1})$. Therefore, we can write

$$\mathbf{E}[e^{\lambda(F(x^i) - F(x^{i+1}))} \mid \mathcal{H}_i] \le e^{\lambda^2 \gamma\beta (F_{c_{i+1}}(c^{i+1}) + F_{b_{i+1}}(y^{i+1}))}. \qquad (5)$$

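We claim that by induction on $d - i$, this implies

$$\mathbf{E}[e^{\lambda(F(x^i) - F(x^d))} \mid \mathcal{H}_i] \le e^{\lambda^2 (\beta F(c^i) + \gamma (F(y^i) - F(y^d)))} \qquad (6)$$

for all $i = 0, \dots, d$. For $i = d$, the claim is trivial. For $i < d$, we can write

$$\mathbf{E}[e^{\lambda(F(x^i) - F(x^d))} \mid \mathcal{H}_i] = \mathbf{E}\left[e^{\lambda(F(x^i) - F(x^{i+1}))} \, \mathbf{E}[e^{\lambda(F(x^{i+1}) - F(x^d))} \mid \mathcal{H}_{i+1}] \mid \mathcal{H}_i\right]$$

and using the inductive hypothesis (6) for $i + 1$,

$$\mathbf{E}[e^{\lambda(F(x^i) - F(x^d))} \mid \mathcal{H}_i] \le \mathbf{E}\left[e^{\lambda(F(x^i) - F(x^{i+1}))} \cdot e^{\lambda^2 (\beta F(c^{i+1}) + \gamma (F(y^{i+1}) - F(y^d)))} \mid \mathcal{H}_i\right]$$
$$= e^{\lambda^2 (\beta F(c^{i+1}) + \gamma (F(y^{i+1}) - F(y^d)))} \cdot \mathbf{E}\left[e^{\lambda(F(x^i) - F(x^{i+1}))} \mid \mathcal{H}_i\right]$$

where we used the fact that the inductive bound is determined by $\mathcal{H}_i$, and so we can take it out of the expectation (it depends only on the sets $\{c_{i+2}, \dots, c_d\}$ and $\{b_{i+2}, \dots, b_d\}$ which are determined even before performing the rounding step on $(c_{i+1}, b_{i+1})$). Taking logs and using (5) to estimate the last expectation, we obtain

$$\log \mathbf{E}[e^{\lambda(F(x^i) - F(x^d))} \mid \mathcal{H}_i] \le \lambda^2 \left(\beta F(c^{i+1}) + \gamma (F(y^{i+1}) - F(y^d))\right) + \lambda^2 \gamma\beta \left(F_{c_{i+1}}(c^{i+1}) + F_{b_{i+1}}(y^{i+1})\right)$$
$$= \lambda^2 \left(\beta \left(F(c^{i+1}) + \gamma F_{c_{i+1}}(c^{i+1})\right) + \gamma \left(F(y^{i+1}) + \beta F_{b_{i+1}}(y^{i+1}) - F(y^d)\right)\right)$$
$$= \lambda^2 \left(\beta F(c^i) + \gamma (F(y^i) - F(y^d))\right)$$

where we used $F(c^{i+1}) + \gamma F_{c_{i+1}}(c^{i+1}) = F(c^i)$ and $F(y^{i+1}) + \beta F_{b_{i+1}}(y^{i+1}) = F(y^i)$ (see the definitions of $c^i$, $y^i$ above).

This proves our inductive claim (6). For $i = 0$, since $x^0 = x_{k-1}$, $x^d = x_k$, $c^0 = c_k$, $y^0 = y_{k+1}$ and $y^d = y_{k+2}$, this gives the statement of the lemma. □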

Now we can finally proceed to the proof of Theorem 1.4.

Proof. We prove inductively the following statement: For any $k$ and any $\lambda \in [0, 1]$,

$$\mathbf{E}[e^{\lambda(\mu_0 - F(x_k))}] \le e^{\lambda^2 \left(\mu_0 (1 + \sum_{i=1}^{k} \beta_{i+1}) - F(y_{k+2})\right)}. \qquad (7)$$

We remind the reader that $\mu_0 = F(x_0)$, $x_k$ is the fractional solution after $k$ stages, and $y_{k+2} = \sum_{i=k+2}^{n} b_i$. We proceed by induction on $k$.

For $k = 0$, the claim is trivial, since $F(y_2) \le F(x_0) = \mu_0$ by monotonicity. For $k \ge 1$, we unroll the expectation as follows:

$$\mathbf{E}[e^{\lambda(\mu_0 - F(x_k))}] = \mathbf{E}\left[e^{\lambda(\mu_0 - F(x_{k-1}))} \, \mathbf{E}[e^{\lambda(F(x_{k-1}) - F(x_k))} \mid \mathcal{H}]\right]$$

where $\mathcal{H}$ is the complete history prior to stage $k$ (up to $x_{k-1}$). We estimate the inside expectation using Lemma D.3:

$$\mathbf{E}[e^{\lambda(F(x_{k-1}) - F(x_k))} \mid \mathcal{H}] \le e^{\lambda^2 (\beta_{k+1} F(c_k) + \gamma_k (F(y_{k+1}) - F(y_{k+2})))} \le e^{\lambda^2 (\beta_{k+1} F(x_{k-1}) + F(y_{k+1}) - F(y_{k+2}))}$$

using monotonicity, $c_k \le x_{k-1}$, $y_{k+2} \le y_{k+1}$ and $\gamma_k \le 1$. Therefore,

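$$\mathbf{E}[e^{\lambda(\mu_0 - F(x_k))}] \le \mathbf{E}\left[e^{\lambda(\mu_0 - F(x_{k-1}))} e^{\lambda^2 (\beta_{k+1} F(x_{k-1}) + F(y_{k+1}) - F(y_{k+2}))}\right]$$
$$= e^{\lambda^2 (\beta_{k+1} \mu_0 + F(y_{k+1}) - F(y_{k+2}))} \, \mathbf{E}\left[e^{(\lambda - \lambda^2 \beta_{k+1})(\mu_0 - F(x_{k-1}))}\right].$$

By the inductive hypothesis (7) with $\lambda' = \lambda - \lambda^2 \beta_{k+1} \in [0, 1]$,

$$\mathbf{E}\left[e^{(\lambda - \lambda^2 \beta_{k+1})(\mu_0 - F(x_{k-1}))}\right] \le e^{(\lambda - \lambda^2 \beta_{k+1})^2 \left(\mu_0 (1 + \sum_{i=1}^{k-1} \beta_{i+1}) - F(y_{k+1})\right)} \le e^{\lambda^2 \left(\mu_0 (1 + \sum_{i=1}^{k-1} \beta_{i+1}) - F(y_{k+1})\right)}.$$

In the last inequality we used $F(y_{k+1}) \le \mu_0$, which holds by monotonicity. Plugging this into the preceding equation,

$$\mathbf{E}[e^{\lambda(\mu_0 - F(x_k))}] \le e^{\lambda^2 (\beta_{k+1} \mu_0 + F(y_{k+1}) - F(y_{k+2}))} \, e^{\lambda^2 \left(\mu_0 (1 + \sum_{i=1}^{k-1} \beta_{i+1}) - F(y_{k+1})\right)}$$
$$= e^{\lambda^2 \left(\mu_0 (1 + \sum_{i=1}^{k} \beta_{i+1}) - F(y_{k+2})\right)}$$

which proves (7). Finally, for $k = n - 1$ we obtain $F(x_{n-1}) = f(R)$ where $R$ is the rounded solution, $y_{n+1} = 0$, and

$$\mathbf{E}[e^{\lambda(\mu_0 - f(R))}] \le e^{\lambda^2 \mu_0 (1 + \sum_{i=1}^{n-1} \beta_{i+1})} \le e^{2\lambda^2 \mu_0} \qquad (8)$$

because $\sum_{i=1}^{n} \beta_i = 1$. The final step is to apply Markov's inequality to the exponential moment. From Markov's inequality and Equation (8), we get

$$\Pr[f(R) \le (1 - \delta)\mu_0] = \Pr[e^{\lambda(\mu_0 - f(R))} \ge e^{\lambda\delta\mu_0}] \le \frac{\mathbf{E}[e^{\lambda(\mu_0 - f(R))}]}{e^{\lambda\delta\mu_0}} \le e^{2\lambda^2 \mu_0 - \lambda\delta\mu_0}.$$

A choice of $\lambda = \delta/4$ gives the statement of the theorem.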



□ 
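For reference, the arithmetic behind the choice $\lambda = \delta/4$ can be written out explicitly. This is only the substitution into the exponent obtained from Markov's inequality and Equation (8); for $\delta \in [0, 1]$ the chosen $\lambda$ lies in $[0, 1]$, as the induction requires.

```latex
\left.\left(2\lambda^{2}\mu_{0} - \lambda\delta\mu_{0}\right)\right|_{\lambda=\delta/4}
= \frac{\delta^{2}\mu_{0}}{8} - \frac{\delta^{2}\mu_{0}}{4}
= -\frac{\delta^{2}\mu_{0}}{8},
\qquad\text{so}\qquad
\Pr[f(R) \le (1-\delta)\mu_{0}] \le e^{-\mu_{0}\delta^{2}/8}.
```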